dcn10_get_dig_frontend problem like this fixed in "drm/amd/display: Add get_dig_frontend implementation for DCEx"

2021-01-13 Thread Andreas Hartmann
Hello,

I'm probably facing a similar problem on this machine during resume after s2ram
with Linux 5.10.7 (see attached file "trace").

The error happens in
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_link_encoder.c:483 (see the
line marked ^^^):

unsigned int dcn10_get_dig_frontend(struct link_encoder *enc)
{
	struct dcn10_link_encoder *enc10 = TO_DCN10_LINK_ENC(enc);
	int32_t value;
	enum engine_id result;

	REG_GET(DIG_BE_CNTL, DIG_FE_SOURCE_SELECT, &value);

	switch (value) {
	case DCN10_DIG_FE_SOURCE_SELECT_DIGA:
		result = ENGINE_ID_DIGA;
		break;
	case DCN10_DIG_FE_SOURCE_SELECT_DIGB:
		result = ENGINE_ID_DIGB;
		break;
	case DCN10_DIG_FE_SOURCE_SELECT_DIGC:
		result = ENGINE_ID_DIGC;
		break;
	case DCN10_DIG_FE_SOURCE_SELECT_DIGD:
		result = ENGINE_ID_DIGD;
		break;
	case DCN10_DIG_FE_SOURCE_SELECT_DIGE:
		result = ENGINE_ID_DIGE;
		break;
	case DCN10_DIG_FE_SOURCE_SELECT_DIGF:
		result = ENGINE_ID_DIGF;
		break;
	case DCN10_DIG_FE_SOURCE_SELECT_DIGG:
		result = ENGINE_ID_DIGG;
		break;
	default:
		// invalid source select DIG
		ASSERT(false);
		result = ENGINE_ID_UNKNOWN;
		^^^
	}

	return result;
}


About the machine:
It's a notebook with two GPUs. AMD is the primary GPU - the secondary GPU 
(Nvidia) is unused (nouveau is not loaded at all - the proprietary driver isn't 
even installed)

05:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] 
Picasso (rev c1) (prog-if 00 [VGA controller])
Subsystem: ASUSTeK Computer Inc. Device 18f1
Flags: bus master, fast devsel, latency 0, IRQ 24
Memory at e000 (64-bit, prefetchable) [size=256M]
Memory at f000 (64-bit, prefetchable) [size=2M]
I/O ports at c000 [size=256]
Memory at f750 (32-bit, non-prefetchable) [size=512K]
Capabilities: [48] Vendor Specific Information: Len=08 
Capabilities: [50] Power Management version 3
Capabilities: [64] Express Legacy Endpoint, MSI 00
Capabilities: [a0] MSI: Enable- Count=1/4 Maskable- 64bit+
Capabilities: [c0] MSI-X: Enable+ Count=3 Masked-
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 

Capabilities: [200] #15
Capabilities: [270] #19
Capabilities: [2a0] Access Control Services
Capabilities: [2b0] Address Translation Service (ATS)
Capabilities: [2c0] Page Request Interface (PRI)
Capabilities: [2d0] Process Address Space ID (PASID)
Capabilities: [320] Latency Tolerance Reporting
Kernel driver in use: amdgpu
Kernel modules: amdgpu

01:00.0 VGA compatible controller: NVIDIA Corporation TU117M [GeForce GTX 1650 
Mobile / Max-Q] (rev a1) (prog-if 00 [VGA controller])
Subsystem: ASUSTeK Computer Inc. Device 109f
Flags: fast devsel, IRQ 255
Memory at f600 (32-bit, non-prefetchable) [disabled] [size=16M]
Memory at c000 (64-bit, prefetchable) [disabled] [size=256M]
Memory at d000 (64-bit, prefetchable) [disabled] [size=32M]
I/O ports at f000 [disabled] [size=128]
Expansion ROM at f700 [disabled] [size=512K]
Capabilities: [60] Power Management version 3
Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [78] Express Legacy Endpoint, MSI 00
Capabilities: [100] Virtual Channel
Capabilities: [250] Latency Tolerance Reporting
Capabilities: [258] L1 PM Substates
Capabilities: [128] Power Budgeting 
Capabilities: [420] Advanced Error Reporting
Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 

Capabilities: [900] #19
Capabilities: [bb0] #15
Kernel modules: nouveau

CPU: AMD Ryzen 7 3750H with Radeon Vega Mobile Gfx

Could you please fix this problem, too?
Please CC me on any reply, because I don't read the kernel mailing list
regularly.


Thanks
Andreas Hartmann
2021-01-13T10:52:02.135202+01:00 localhost kernel: [  155.645178] ------------[ cut here ]------------
2021-01-13T10:52:02.135204+01:00 localhost kernel: [  155.645330] WARNING: CPU: 6 PID: 4116 at ../drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_link_encoder.c:483 dcn10_get_dig_frontend+0x65/0xb0 [amdgpu]
2021-01-13T10:52:02.135205+01:00 localhost kernel: [  155.645331] Modules linked in: fuse iptable_mangle xt_TCPMSS xt_tcpudp bpfilter ip_tables x_tables af_packet dmi_sysfs uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc msr snd_hda_codec_realtek snd_hda_codec_generic

Re: [PATCH 4.19 13/99] netfilter: nf_conncount: fix argument order to find_next_bit

2019-04-22 Thread Andreas Hartmann
On 22.04.19 at 20:57 Florian Westphal wrote:
> Andreas Hartmann  wrote:
>>> Could you at least tell us how you're using nf_conncount (nf/iptables
>>> rules)?
>>
>> # Generated by iptables-save v1.6.2 on Mon Apr 22 20:19:30 2019
>> *filter
>> :INPUT DROP [0:0]
>> :FORWARD ACCEPT [0:0]
>> :OUTPUT DROP [4423:248703]
>> -A INPUT -s 127.0.0.1/32 -d 239.255.255.250/32 -i lo -p udp -j ACCEPT
>> -A INPUT -p tcp -m tcp --dport 113 -j REJECT --reject-with 
>> icmp-port-unreachable
>> -A INPUT -d 255.255.255.255/32 -p udp -j ACCEPT
>> -A INPUT -d 224.0.0.1/32 -j ACCEPT
>> -A INPUT -s 127.0.0.1/32 -d 127.0.0.2/32 -i lo -j ACCEPT
>> -A INPUT -s 127.0.0.1/32 -d 127.0.0.1/32 -i lo -j ACCEPT
>> -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
>> -A INPUT -s 192.168.22.0/24 -j ACCEPT
>> -A INPUT -j LOG --log-prefix "In Input gesperrt: "
>> -A INPUT -s 169.254.2.1/32 -d 169.254.2.2/32 -i br1 -p tcp -m tcp --sport 80 
>> -j ACCEPT
>> -A OUTPUT -s 192.168.22.6/32 -d 224.0.0.22/32 -o lo -p igmp -j ACCEPT
>> -A OUTPUT -d 192.168.6.173/32 -o br1 -p tcp -m tcp --dport 80 -j ACCEPT
>> -A OUTPUT -s 169.254.2.2/32 -d 239.255.255.250/32 -o br1 -p udp -j DROP
>> -A OUTPUT -s 192.168.22.6/32 -d 224.0.0.251/32 -o br1 -p udp -j ACCEPT
>> -A OUTPUT -s 127.0.0.1/32 -d 239.255.255.250/32 -o lo -p udp -j ACCEPT
>> -A OUTPUT -s 192.168.22.6/32 -d 255.255.255.255/32 -o br1 -p udp -m udp 
>> --dport 1900 -j ACCEPT
>> -A OUTPUT -s 127.0.0.1/32 -d 127.255.255.255/32 -o br1 -p udp -j ACCEPT
>> -A OUTPUT -s 192.168.22.6/32 -d 239.0.0.250/32 -o br1 -p igmp -j ACCEPT
>> -A OUTPUT -s 192.168.22.6/32 -d 239.255.255.250/32 -o br1 -p igmp -j ACCEPT
>> -A OUTPUT -s 192.168.22.6/32 -d 239.255.255.250/32 -o br1 -p udp -m udp 
>> --dport 1900 -j ACCEPT
>> -A OUTPUT -s 192.168.22.6/32 -d 239.1.1.1/32 -o br1 -p udp -j ACCEPT
>> -A OUTPUT -s 192.168.22.6/32 -d 239.1.1.1/32 -o br1 -p igmp -j ACCEPT
>> -A OUTPUT -s 192.168.22.6/32 -d 224.0.0.251/32 -o br1 -p igmp -j ACCEPT
>> -A OUTPUT -s 192.168.22.6/32 -p tcp -m tcp --dport 1935 -j ACCEPT
>> -A OUTPUT -s 192.168.22.0/24 -d 192.168.3.0/24 -j ACCEPT
>> -A OUTPUT -s 127.0.0.1/32 -d 127.0.0.2/32 -o lo -j ACCEPT
>> -A OUTPUT -s 127.0.0.1/32 -d 127.0.0.1/32 -o lo -j ACCEPT
>> -A OUTPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
>> -A OUTPUT -s 192.168.22.0/24 -d 192.168.22.0/24 -j ACCEPT
>> -A OUTPUT -j LOG --log-prefix "In Output gesperrt: "
>> -A OUTPUT -s 169.254.2.2/32 -d 169.254.2.1/32 -o br1 -p tcp -m tcp --dport 
>> 80 -j ACCEPT
>> COMMIT
> 
> I don't see connlimit match is in use.
> 
> Could you post output of
> 
> lsmod | grep nf_conncount
> 
> and
> 
> grep CONNCOUNT ~/your_kernel_conf

True, it's not in use at all (it's not even configured). I'm surprised that
removing the patches seems to fix the problem anyway.
OK, I'll keep testing for a few more weeks. If the problem comes up again,
this was a false positive.
If I don't see it any more, I wouldn't know what to do next at the moment.

Regarding git bisect: the only other remaining candidate changes at the moment
would be

tty: Don't hold ldisc lock in tty_reopen() if ldisc present (Dmitry Safonov)
tty: Simplify tty->count math in tty_reopen() (Dmitry Safonov)
tty: Hold tty_ldisc_lock() during tty_reopen() (Dmitry Safonov)
tty/ldsem: Wake up readers after timed out down_write() (Dmitry Safonov)

But I don't know how these changes could break video streaming using serviio.


Thanks
Andreas


Re: [PATCH 4.19 13/99] netfilter: nf_conncount: fix argument order to find_next_bit

2019-04-22 Thread Andreas Hartmann
On 22.04.19 at 19:27 Florian Westphal wrote:
> Andreas Hartmann  wrote:
>> Since 4.19.17, I'm facing problems during streaming of videos I've never 
>> seen before. This means:
>>
>> - video from internet stutters although enough data flow can be seen in bmon.
>> - gpu is locked:
>>   radeon :0a:00.0: ring 0 stalled for more than 14084msec
>>   radeon :0a:00.0: GPU lockup (current fence id 0x00053ed7 last 
>> fence id 0x00053f0f on ring 0)
>> - The connection of videos streamed locally by the machine for a TV suddenly 
>> breaks (upnp - serviio as server).
>>
>> After very long time of testing, I detected, that removing the complete 
>> patch series for 4.19.17 regarding netfilter: nf_conncount makes the problem 
>> disappear.
>>
>> Please remove / fix this patchset!
> 
> The state in 4.19.y is same as in mainline.

I don't use mainline.

> Could you at least tell us how you're using nf_conncount (nf/iptables
> rules)?

The host has a Ryzen 7 1700 CPU and runs 4 KVM VMs, with one Ethernet interface
and 2 bridges (2 different networks). One VM works as a bridge between both
networks. The iptables rules are the following:

# Generated by iptables-save v1.6.2 on Mon Apr 22 20:19:30 2019
*filter
:INPUT DROP [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT DROP [4423:248703]
-A INPUT -s 127.0.0.1/32 -d 239.255.255.250/32 -i lo -p udp -j ACCEPT
-A INPUT -p tcp -m tcp --dport 113 -j REJECT --reject-with icmp-port-unreachable
-A INPUT -d 255.255.255.255/32 -p udp -j ACCEPT
-A INPUT -d 224.0.0.1/32 -j ACCEPT
-A INPUT -s 127.0.0.1/32 -d 127.0.0.2/32 -i lo -j ACCEPT
-A INPUT -s 127.0.0.1/32 -d 127.0.0.1/32 -i lo -j ACCEPT
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -s 192.168.22.0/24 -j ACCEPT
-A INPUT -j LOG --log-prefix "In Input gesperrt: "
-A INPUT -s 169.254.2.1/32 -d 169.254.2.2/32 -i br1 -p tcp -m tcp --sport 80 -j ACCEPT
-A OUTPUT -s 192.168.22.6/32 -d 224.0.0.22/32 -o lo -p igmp -j ACCEPT
-A OUTPUT -d 192.168.6.173/32 -o br1 -p tcp -m tcp --dport 80 -j ACCEPT
-A OUTPUT -s 169.254.2.2/32 -d 239.255.255.250/32 -o br1 -p udp -j DROP
-A OUTPUT -s 192.168.22.6/32 -d 224.0.0.251/32 -o br1 -p udp -j ACCEPT
-A OUTPUT -s 127.0.0.1/32 -d 239.255.255.250/32 -o lo -p udp -j ACCEPT
-A OUTPUT -s 192.168.22.6/32 -d 255.255.255.255/32 -o br1 -p udp -m udp --dport 1900 -j ACCEPT
-A OUTPUT -s 127.0.0.1/32 -d 127.255.255.255/32 -o br1 -p udp -j ACCEPT
-A OUTPUT -s 192.168.22.6/32 -d 239.0.0.250/32 -o br1 -p igmp -j ACCEPT
-A OUTPUT -s 192.168.22.6/32 -d 239.255.255.250/32 -o br1 -p igmp -j ACCEPT
-A OUTPUT -s 192.168.22.6/32 -d 239.255.255.250/32 -o br1 -p udp -m udp --dport 1900 -j ACCEPT
-A OUTPUT -s 192.168.22.6/32 -d 239.1.1.1/32 -o br1 -p udp -j ACCEPT
-A OUTPUT -s 192.168.22.6/32 -d 239.1.1.1/32 -o br1 -p igmp -j ACCEPT
-A OUTPUT -s 192.168.22.6/32 -d 224.0.0.251/32 -o br1 -p igmp -j ACCEPT
-A OUTPUT -s 192.168.22.6/32 -p tcp -m tcp --dport 1935 -j ACCEPT
-A OUTPUT -s 192.168.22.0/24 -d 192.168.3.0/24 -j ACCEPT
-A OUTPUT -s 127.0.0.1/32 -d 127.0.0.2/32 -o lo -j ACCEPT
-A OUTPUT -s 127.0.0.1/32 -d 127.0.0.1/32 -o lo -j ACCEPT
-A OUTPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A OUTPUT -s 192.168.22.0/24 -d 192.168.22.0/24 -j ACCEPT
-A OUTPUT -j LOG --log-prefix "In Output gesperrt: "
-A OUTPUT -s 169.254.2.2/32 -d 169.254.2.1/32 -o br1 -p tcp -m tcp --dport 80 -j ACCEPT
COMMIT
# Completed on Mon Apr 22 20:19:30 2019


br0: flags=4163  mtu 9000
ether .  txqueuelen 1000  (Ethernet)
RX packets 1376  bytes 139220 (135.9 KiB)
RX errors 0  dropped 0  overruns 0  frame 0
TX packets 0  bytes 0 (0.0 B)
TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

br1: flags=4163  mtu 1512
inet 192.168.22.6  netmask 255.255.255.0  broadcast 192.168.22.255
ether .  txqueuelen 1000  (Ethernet)
RX packets 1161816  bytes 2806028482 (2.6 GiB)
RX errors 0  dropped 0  overruns 0  frame 0
TX packets 1427306  bytes 2032637199 (1.8 GiB)
TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0: flags=4163  mtu 1512
ether .  txqueuelen 1000  (Ethernet)
RX packets 119990  bytes 110191277 (105.0 MiB)
RX errors 0  dropped 0  overruns 0  frame 0
TX packets 204094  bytes 234832004 (223.9 MiB)
TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
device interrupt 36  memory 0xfc8c-fc8e

lo: flags=73  mtu 65536
inet 127.0.0.1  netmask 255.0.0.0
loop  txqueuelen 1000  (Local Loopback)
RX packets 2474  bytes 16626724 (15.8 MiB)
RX errors 0  dropped 0  overruns 0  frame 0
TX packets 2474  bytes 16626724 (15.8 MiB)
TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

tap0: flags=4163  mtu 1512
ether   txqueuelen 1000 

Re: [PATCH 4.19 13/99] netfilter: nf_conncount: fix argument order to find_next_bit

2019-04-22 Thread Andreas Hartmann
Hello!

Since 4.19.17, I'm facing problems during streaming of videos I've never seen 
before. This means:

- video from internet stutters although enough data flow can be seen in bmon.
- gpu is locked:
  radeon :0a:00.0: ring 0 stalled for more than 14084msec
  radeon :0a:00.0: GPU lockup (current fence id 0x00053ed7 last 
fence id 0x00053f0f on ring 0)
- The connection of videos streamed locally by the machine for a TV suddenly 
breaks (upnp - serviio as server).

After a very long time of testing, I found that removing the complete patch
series for 4.19.17 regarding netfilter: nf_conncount makes the problem
disappear.

Please remove / fix this patchset!


Thanks
Andreas Hartmann


On 21.01.19 at 14:48 Greg Kroah-Hartman wrote:
> 4.19-stable review patch.  If anyone has any objections, please let me know.
> 
> --
> 
> From: Florian Westphal 
> 
> commit a007232066f6839d6f256bab21e825d968f1a163 upstream.
> 
> Size and 'next bit' were swapped, this bug could cause worker to
> reschedule itself even if system was idle.
> 
> Fixes: 5c789e131cbb9 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
> Reviewed-by: Shawn Bohrer 
> Signed-off-by: Florian Westphal 
> Signed-off-by: Pablo Neira Ayuso 
> Signed-off-by: Greg Kroah-Hartman 
> 
> ---
>  net/netfilter/nf_conncount.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- a/net/netfilter/nf_conncount.c
> +++ b/net/netfilter/nf_conncount.c
> @@ -488,7 +488,7 @@ next:
>   clear_bit(tree, data->pending_trees);
>  
>   next_tree = (tree + 1) % CONNCOUNT_SLOTS;
> - next_tree = find_next_bit(data->pending_trees, next_tree, CONNCOUNT_SLOTS);
> + next_tree = find_next_bit(data->pending_trees, CONNCOUNT_SLOTS, next_tree);
>  
>   if (next_tree < CONNCOUNT_SLOTS) {
>   data->gc_tree = next_tree;
> 
> 
> 
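
The effect of the swapped arguments in the quoted patch can be illustrated
with a small userspace sketch. It is not the kernel implementation:
sketch_find_next_bit() is a hypothetical stand-in that only mirrors the
argument order (bitmap, size, offset) of find_next_bit(), and SKETCH_SLOTS is
an assumed value standing in for CONNCOUNT_SLOTS. With the swapped order the
search range is empty, so the call returns the passed-in "size" (here
next_tree), which looks like a pending tree even when no bit is set.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_SLOTS 256UL	/* assumption: stand-in for CONNCOUNT_SLOTS */

/*
 * Userspace stand-in using the same argument order as the kernel helper:
 * (bitmap, size in bits, first bit to examine). Returns the index of the
 * first set bit at or after 'offset', or 'size' if there is none.
 */
static unsigned long sketch_find_next_bit(const unsigned long *addr,
					  unsigned long size,
					  unsigned long offset)
{
	unsigned long bit;

	assert(addr != NULL);
	for (bit = offset; bit < size; bit++) {
		if (addr[bit / BITS_PER_LONG] & (1UL << (bit % BITS_PER_LONG)))
			return bit;
	}
	return size;
}

/* Order after the fix: search the range [next_tree, SKETCH_SLOTS). */
static unsigned long next_pending_fixed(const unsigned long *pending,
					unsigned long next_tree)
{
	return sketch_find_next_bit(pending, SKETCH_SLOTS, next_tree);
}

/* Order before the fix: next_tree is used as size, SKETCH_SLOTS as offset. */
static unsigned long next_pending_swapped(const unsigned long *pending,
					  unsigned long next_tree)
{
	return sketch_find_next_bit(pending, next_tree, SKETCH_SLOTS);
}
```

With an all-zero bitmap, next_pending_fixed(pending, 5) returns SKETCH_SLOTS
("nothing pending"), while next_pending_swapped(pending, 5) returns 5, which
passes a "next_tree < CONNCOUNT_SLOTS" style check and would match the
described symptom of the worker rescheduling itself on an idle system.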





Re: Spectre mitigation doesn't seem to work at all?!

2018-06-04 Thread Andreas Hartmann
On 06/04/2018 at 04:12 PM Alan Cox wrote:
>> A malicious program most probably won't care about that. Therefore, my
>> next question is: which memory regions can be exploited by a malicious
>> program? The complete physical memory or only the memory provided to the
>> malicious program? Should be the latter if this approach should have any
>> impact.
> 
> Spectre is not about memory regions. It's about speculative execution
> leaving measurable footprints. What footprints you leave depend upon what
> code you are executing. Thus the question becomes 'what can the target
> access'.
> 
> In order to attack something you need both a way to influence the code
> concerned and a way to measure it. In addition it needs to have some
> secret you want.
> 
> In practice that usually means something on the same system with its own
> memory space/privilege level. The usual cases then are user<->kernel and
> managed application<->runtime.

Would this be a practical test case: gathering keys and passwords used by an
ssh login by running a malicious program in parallel to sshd as another
ordinary user without root access?


Thanks,
Andreas


Re: Spectre mitigation doesn't seem to work at all?!

2018-06-04 Thread Andreas Hartmann
Hello Mark,

On 06/04/2018 at 11:19 AM Mark Rutland wrote:
> On Mon, Jun 04, 2018 at 10:50:07AM +0200, Andreas Hartmann wrote:
>> Hello Peter,
>>
>> thanks for your answer! I appreciate it!
>>
>> On 06/04/2018 at 10:15 AM Peter Zijlstra wrote:
>>> On Fri, Jun 01, 2018 at 02:19:38PM +0200, Andreas Hartmann wrote:
>>>
>>>> I tested the spectre mitigation of different machines and kernels with
>>>> https://github.com/crozone/SpectrePoC
>>>>
>>>> You can see the results below.
>>>
>>>> My question: Did I miss something?
>>>
>>> Yes.
>>>
>>>> Build: ... INTEL_MITIGATION_DISABLED LINUX_KERNEL_MITIGATION_DISABLED
>>>> Build: ... INTEL_MITIGATION_DISABLED LINUX_KERNEL_MITIGATION_DISABLED
>>>> Build: ... INTEL_MITIGATION_DISABLED LINUX_KERNEL_MITIGATION_DISABLED
>>>
>>>    
>>>
>>> The POC is a v1 on itself. V1 needs to be fixed for every individual
>>> executable (worse, for every individual location in the code, and we're
>>> still finding them). The kernel mitigation status for v1 only indicates
>>> the kernel itself has mitigations (for some locations).
>>>
>>> The POC is meant to test effectiveness of these mitigations, either the
>>> original LFENCE or the dependent instruction thing, but you have to
>>> enable one or the other.
>>
>> Ok, this means every program running on the machine has to care itself
>> to be spectre v1 - safe.
> 
> Correct. Primarily this matters for things like JITs, where untrusted code
> may be run in the same address space as sensitive data.
> 
>> A malicious program most probably won't care about that. Therefore, my
>> next question is: which memory regions can be exploited by a malicious
>> program? The complete physical memory or only the memory provided to the
>> malicious program? Should be the latter if this approach should have any
>> impact.
> 
> Assuming you have a CPU which is not vulnerable to meltdown / variant-3, or
> you have mitigated this, (e.g. with KPTI), a malicious program can only
> access data within its own address space.
> 
> Spectre variant-1 alone only gives access to memory in the address space of
> the program itself.

Thanks Mark! I now have a better understanding of the effects of the
different vulnerabilities around Spectre and Meltdown, and I'm hopefully
able to assess them better.

As I'm mostly using AMD CPUs (e.g. Ryzen 1) for virtualization, I
should be secure by default regarding unwanted global memory access from
the VM to the host memory, because the Ryzen 1 CPU is not affected by
Meltdown at all.


Regards,
Andreas


Re: Spectre mitigation doesn't seem to work at all?!

2018-06-04 Thread Andreas Hartmann
Hello Peter,

thanks for your answer! I appreciate it!

On 06/04/2018 at 10:15 AM Peter Zijlstra wrote:
> On Fri, Jun 01, 2018 at 02:19:38PM +0200, Andreas Hartmann wrote:
> 
>> I tested the spectre mitigation of different machines and kernels with
>> https://github.com/crozone/SpectrePoC
>>
>> You can see the results below.
> 
>> My question: Did I miss something?
> 
> Yes.
> 
>> Build: ... INTEL_MITIGATION_DISABLED LINUX_KERNEL_MITIGATION_DISABLED
>> Build: ... INTEL_MITIGATION_DISABLED LINUX_KERNEL_MITIGATION_DISABLED
>> Build: ... INTEL_MITIGATION_DISABLED LINUX_KERNEL_MITIGATION_DISABLED
> 
>    
> 
> The POC is a v1 on itself. V1 needs to be fixed for every individual
> executable (worse, for every individual location in the code, and we're
> still finding them). The kernel mitigation status for v1 only indicates
> the kernel itself has mitigations (for some locations).
> 
> The POC is meant to test effectiveness of these mitigations, either the
> original LFENCE or the dependent instruction thing, but you have to
> enable one or the other.

OK, this means every program running on the machine has to take care of
being Spectre v1 safe itself.

A malicious program most probably won't care about that. Therefore, my
next question is: which memory regions can be exploited by a malicious
program? The complete physical memory or only the memory provided to the
malicious program? Should be the latter if this approach should have any
impact.


Thanks,
Andreas
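
As a rough illustration of what a per-location v1 fix can look like, here is
a minimal userspace sketch of index masking, assuming that "the dependent
instruction thing" in Peter's reply refers to this technique. The names
sketch_index_mask() and sketch_read() are hypothetical. The mask expression
follows the generic branch-free idea used for this purpose: it yields all
ones when index < size and zero otherwise, so an out-of-bounds index is
clamped to 0 even if the bounds check is mispredicted. It assumes size is at
most LONG_MAX and that right-shifting a negative long is an arithmetic shift
(true for gcc and clang, implementation-defined in ISO C). An optimizing
compiler may remove the masking in plain C like this, which is why real
implementations take extra precautions.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Returns ~0UL when index < size and 0UL otherwise, without a conditional
 * branch that could be mispredicted. Assumes size <= LONG_MAX.
 */
static unsigned long sketch_index_mask(unsigned long index, unsigned long size)
{
	return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
			       (SKETCH_BITS_PER_LONG - 1));
}

/*
 * Hypothetical bounds-checked table lookup. Returns the byte at 'index',
 * or -1 if the index is out of range.
 */
static int sketch_read(const unsigned char *table, unsigned long size,
		       unsigned long index)
{
	assert(table != NULL);
	if (index >= size)
		return -1;
	/*
	 * The load below depends on the masked index, so a speculatively
	 * executed out-of-bounds access would read table[0] instead.
	 */
	index &= sketch_index_mask(index, size);
	return table[index];
}
```

Every array access that takes an untrusted index needs this kind of
treatment individually, which is the point made in the quoted reply about V1
having to be fixed per location.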


Re: Spectre mitigation doesn't seem to work at all?!

2018-06-04 Thread Andreas Hartmann
Hello!

Sorry for the ping, but I think the behavior shown below should really be
investigated!


Thanks,
Andreas




On 06/01/2018 at 02:19 PM Andreas Hartmann wrote:
> Hello!
> 
> I tested the spectre mitigation of different machines and kernels with
> https://github.com/crozone/SpectrePoC
> 
> You can see the results below.
> 
> 
> My question: Did I miss something?
> My expectation was, that on base of the output of
> /sys/devices/system/cpu/vulnerabilities/spectre_v* as shown below the
> problem should be gone away.
> But the results seem to tell me something other ... .
> 
> 
> Thanks
> Andreas
> 
> 
> 
> 
> --
> 
> CPU:    AMD Ryzen 7 1700X Eight-Core Processor
> Bios:   BIOS 4011 04/19/2018 - ibpb is listed in /proc/cpuinfo
> Kernel: 4.14.44-1.1-default
> cat /sys/devices/system/cpu/vulnerabilities/spectre_v2
> Mitigation: Full AMD retpoline, IBPB
> cat /sys/devices/system/cpu/vulnerabilities/spectre_v1
> Mitigation: __user pointer sanitization
> 
>  ./spectre.out
> Using a cache hit threshold of 80.
> Build: RDTSCP_SUPPORTED MFENCE_SUPPORTED CLFLUSH_SUPPORTED
> INTEL_MITIGATION_DISABLED LINUX_KERNEL_MITIGATION_DISABLED
> Reading 40 bytes:
> Reading at malicious_x = 0xffdfec18... Success: 0x54=’T’ score=2
> Reading at malicious_x = 0xffdfec19... Success: 0x68=’h’ score=2
> Reading at malicious_x = 0xffdfec1a... Success: 0x65=’e’ score=2
> Reading at malicious_x = 0xffdfec1b... Success: 0x20=’ ’ score=2
> Reading at malicious_x = 0xffdfec1c... Success: 0x4D=’M’ score=2
> Reading at malicious_x = 0xffdfec1d... Success: 0x61=’a’ score=2
> Reading at malicious_x = 0xffdfec1e... Success: 0x67=’g’ score=2
> Reading at malicious_x = 0xffdfec1f... Success: 0x69=’i’ score=2
> Reading at malicious_x = 0xffdfec20... Success: 0x63=’c’ score=2
> Reading at malicious_x = 0xffdfec21... Success: 0x20=’ ’ score=2
> Reading at malicious_x = 0xffdfec22... Success: 0x57=’W’ score=2
> Reading at malicious_x = 0xffdfec23... Success: 0x6F=’o’ score=2
> Reading at malicious_x = 0xffdfec24... Success: 0x72=’r’ score=2
> Reading at malicious_x = 0xffdfec25... Success: 0x64=’d’ score=2
> Reading at malicious_x = 0xffdfec26... Success: 0x73=’s’ score=2
> Reading at malicious_x = 0xffdfec27... Success: 0x20=’ ’ score=2
> Reading at malicious_x = 0xffdfec28... Success: 0x61=’a’ score=2
> Reading at malicious_x = 0xffdfec29... Success: 0x72=’r’ score=2
> Reading at malicious_x = 0xffdfec2a... Success: 0x65=’e’ score=2
> Reading at malicious_x = 0xffdfec2b... Success: 0x20=’ ’ score=2
> Reading at malicious_x = 0xffdfec2c... Success: 0x53=’S’ score=2
> Reading at malicious_x = 0xffdfec2d... Success: 0x71=’q’ score=2
> Reading at malicious_x = 0xffdfec2e... Success: 0x75=’u’ score=2
> Reading at malicious_x = 0xffdfec2f... Success: 0x65=’e’ score=2
> Reading at malicious_x = 0xffdfec30... Success: 0x61=’a’ score=2
> Reading at malicious_x = 0xffdfec31... Success: 0x6D=’m’ score=2
> Reading at malicious_x = 0xffdfec32... Success: 0x69=’i’ score=2
> Reading at malicious_x = 0xffdfec33... Success: 0x73=’s’ score=2
> Reading at malicious_x = 0xffdfec34... Success: 0x68=’h’ score=2
> Reading at malicious_x = 0xffdfec35... Success: 0x20=’ ’ score=2
> Reading at malicious_x = 0xffdfec36... Success: 0x4F=’O’ score=2
> Reading at malicious_x = 0xffdfec37... Success: 0x73=’s’ score=2
> Reading at malicious_x = 0xffdfec38... Success: 0x73=’s’ score=2
> Reading at malicious_x = 0xffdfec39... Success: 0x69=’i’ score=2
> Reading at malicious_x = 0xffdfec3a... Success: 0x66=’f’ score=2
> Reading at malicious_x = 0xffdfec3b... Success: 0x72=’r’ score=2
> Reading at malicious_x = 0xffdfec3c... Success: 0x61=’a’ score=2
> Reading at malicious_x = 0xffdfec3d... Success: 0x67=’g’ score=2
> Reading at malicious_x = 0xffdfec3e... Success: 0x65=’e’ score=2
> Reading at malicious_x = 0xffdfec3f... Success: 0x2E=’.’ score=2
> 
> 
> --
> 
> CPU:    AMD G-T40E Processor
> Kernel: 4.14.44-1.el6.x86_64
> cat /sys/devices/system/cpu/vulnerabilities/spectre_v1
> Mitigation: __user pointer sanitization
> cat /sys/devices/system/cpu/vulnerabilities/spectre_v2
> Mitigation: Full AMD retpoline
> 
> ./spectre.out 130
> Using a cache hit threshold of 130.

Spectre mitigation doesn't seem to work at all?!

2018-06-01 Thread Andreas Hartmann

Hello!

I tested the spectre mitigation of different machines and kernels with
https://github.com/crozone/SpectrePoC

You can see the results below.


My question: Did I miss something?
Based on the output of
/sys/devices/system/cpu/vulnerabilities/spectre_v* shown below, I expected the
problem to be gone.
But the results seem to tell me something different.


Thanks
Andreas
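
For reference, the sysfs status files mentioned above can also be read
programmatically. The following is only a sketch: sketch_read_line() and
sketch_print_status() are hypothetical helper names, and the only thing
taken from the message is the /sys/devices/system/cpu/vulnerabilities/
directory. The files report the kernel's own mitigation status as one line
of text each.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy the first line of 'path' into 'buf', without the trailing newline.
 * Returns 0 on success, -1 if the file cannot be opened or is empty.
 */
static int sketch_read_line(const char *path, char *buf, size_t len)
{
	FILE *f;

	assert(buf != NULL && len > 0);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/* Print the status line for one entry, e.g. "spectre_v1". */
static void sketch_print_status(const char *name)
{
	char path[128];
	char line[256];

	snprintf(path, sizeof(path),
		 "/sys/devices/system/cpu/vulnerabilities/%s", name);
	if (sketch_read_line(path, line, sizeof(line)) == 0)
		printf("%s: %s\n", name, line);
	else
		printf("%s: not reported by this kernel\n", name);
}
```

Calling sketch_print_status("spectre_v1") and sketch_print_status("spectre_v2")
prints the same "Mitigation: ..." lines that cat shows in the results below,
or a fallback message on systems without these files.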




--
CPU:    AMD Ryzen 7 1700X Eight-Core Processor
Bios:   BIOS 4011 04/19/2018 - ibpb is listed in /proc/cpuinfo
Kernel: 4.14.44-1.1-default
cat /sys/devices/system/cpu/vulnerabilities/spectre_v2
Mitigation: Full AMD retpoline, IBPB
cat /sys/devices/system/cpu/vulnerabilities/spectre_v1
Mitigation: __user pointer sanitization

 ./spectre.out
Using a cache hit threshold of 80.
Build: RDTSCP_SUPPORTED MFENCE_SUPPORTED CLFLUSH_SUPPORTED 
INTEL_MITIGATION_DISABLED LINUX_KERNEL_MITIGATION_DISABLED
Reading 40 bytes:
Reading at malicious_x = 0xffdfec18... Success: 0x54=’T’ score=2
Reading at malicious_x = 0xffdfec19... Success: 0x68=’h’ score=2
Reading at malicious_x = 0xffdfec1a... Success: 0x65=’e’ score=2
Reading at malicious_x = 0xffdfec1b... Success: 0x20=’ ’ score=2
Reading at malicious_x = 0xffdfec1c... Success: 0x4D=’M’ score=2
Reading at malicious_x = 0xffdfec1d... Success: 0x61=’a’ score=2
Reading at malicious_x = 0xffdfec1e... Success: 0x67=’g’ score=2
Reading at malicious_x = 0xffdfec1f... Success: 0x69=’i’ score=2
Reading at malicious_x = 0xffdfec20... Success: 0x63=’c’ score=2
Reading at malicious_x = 0xffdfec21... Success: 0x20=’ ’ score=2
Reading at malicious_x = 0xffdfec22... Success: 0x57=’W’ score=2
Reading at malicious_x = 0xffdfec23... Success: 0x6F=’o’ score=2
Reading at malicious_x = 0xffdfec24... Success: 0x72=’r’ score=2
Reading at malicious_x = 0xffdfec25... Success: 0x64=’d’ score=2
Reading at malicious_x = 0xffdfec26... Success: 0x73=’s’ score=2
Reading at malicious_x = 0xffdfec27... Success: 0x20=’ ’ score=2
Reading at malicious_x = 0xffdfec28... Success: 0x61=’a’ score=2
Reading at malicious_x = 0xffdfec29... Success: 0x72=’r’ score=2
Reading at malicious_x = 0xffdfec2a... Success: 0x65=’e’ score=2
Reading at malicious_x = 0xffdfec2b... Success: 0x20=’ ’ score=2
Reading at malicious_x = 0xffdfec2c... Success: 0x53=’S’ score=2
Reading at malicious_x = 0xffdfec2d... Success: 0x71=’q’ score=2
Reading at malicious_x = 0xffdfec2e... Success: 0x75=’u’ score=2
Reading at malicious_x = 0xffdfec2f... Success: 0x65=’e’ score=2
Reading at malicious_x = 0xffdfec30... Success: 0x61=’a’ score=2
Reading at malicious_x = 0xffdfec31... Success: 0x6D=’m’ score=2
Reading at malicious_x = 0xffdfec32... Success: 0x69=’i’ score=2
Reading at malicious_x = 0xffdfec33... Success: 0x73=’s’ score=2
Reading at malicious_x = 0xffdfec34... Success: 0x68=’h’ score=2
Reading at malicious_x = 0xffdfec35... Success: 0x20=’ ’ score=2
Reading at malicious_x = 0xffdfec36... Success: 0x4F=’O’ score=2
Reading at malicious_x = 0xffdfec37... Success: 0x73=’s’ score=2
Reading at malicious_x = 0xffdfec38... Success: 0x73=’s’ score=2
Reading at malicious_x = 0xffdfec39... Success: 0x69=’i’ score=2
Reading at malicious_x = 0xffdfec3a... Success: 0x66=’f’ score=2
Reading at malicious_x = 0xffdfec3b... Success: 0x72=’r’ score=2
Reading at malicious_x = 0xffdfec3c... Success: 0x61=’a’ score=2
Reading at malicious_x = 0xffdfec3d... Success: 0x67=’g’ score=2
Reading at malicious_x = 0xffdfec3e... Success: 0x65=’e’ score=2
Reading at malicious_x = 0xffdfec3f... Success: 0x2E=’.’ score=2


--
CPU:    AMD G-T40E Processor
Kernel: 4.14.44-1.el6.x86_64
cat /sys/devices/system/cpu/vulnerabilities/spectre_v1
Mitigation: __user pointer sanitization
cat /sys/devices/system/cpu/vulnerabilities/spectre_v2
Mitigation: Full AMD retpoline

./spectre.out 130
Using a cache hit threshold of 130.
Build: RDTSCP_SUPPORTED MFENCE_SUPPORTED CLFLUSH_SUPPORTED 
INTEL_MITIGATION_DISABLED LINUX_KERNEL_MITIGATION_DISABLED
Reading 40 bytes:
Reading at malicious_x = 0xffdfebf0... Unclear: 0x54=’T’ score=999 (second best: 0x00=’?’ score=992)
Reading at malicious_x = 0xffdfebf1... Unclear: 0x68=’h’ score=996 (second best: 0x00=’?’ score=988)
Reading at malicious_x = 0xffdfebf2... Unclear: 0x65=’e’ score=999 (second best: 0x00=’?’ score=985)
Reading at malicious_x = 0xffdfebf3... Unclear: 0x20=’ ’ score=997 (second best: 0x00=’?’ score=989)
Reading at malicious_x = 0xffdfebf4

Re: [FYI] GCC segfaults under heavy multithreaded compilation with AMD Ryzen

2017-07-31 Thread Andreas Hartmann
On 07/31/2017 at 02:10 PM Alan Cox wrote:
> On Wed, 26 Jul 2017 06:54:01 +0900
> Satoru Takeuchi  wrote:
> 
>> # I'm a LKML subscriber, but not a x86 list subscriber
>>
>> I found the following new linux kernel bugzilla about Ryzen related problem.
>> Since many developers don't check this bugzilla and I've also
>> encountered this problem,
>> I decided to introduce this problem here.
> 
> Historically we've seen exactly these symptoms on all kinds of systems
> where the memory is at fault, even in cases where memtest86 passes.
> Whether there's a specific problem on some Ryzen boards is a question for
> AMD, but if I saw this without knowing the CPU I'd suspect memory
> firstly. GCC it turns out is by accident an amazingly effective memory
> testing tool.

That's surely true. Meanwhile, I got rid of my memory problems (no
more traces like these [1], or even system hangs) by configuring the
memory correctly, but the gcc segfaults remain, most of the time with
kernel 4.12. Kernels 4.11.x and 4.9.39ff work mostly fine ("mostly"
because I stopped testing and therefore can't say whether they are really
stable), but (k)aslr must always be disabled.

FreeBSD meanwhile provides this workaround after long research [2]:

https://reviews.freebsd.org/D11780

Please port it to Linux!


[1] https://www.spinics.net/lists/kernel/msg2565491.html
[2] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219399#c89


Re: [FYI] GCC segfaults under heavy multithreaded compilation with AMD Ryzen

2017-07-25 Thread Andreas Hartmann
On 07/26/2017 at 12:00 AM Satoru Takeuchi wrote:
> # I'm a LKML subscriber, but not a x86 list subscriber
> 
> I found the following new linux kernel bugzilla about Ryzen related problem.
> Since many developers don't check this bugzilla and I've also
> encountered this problem,
> I decided to introduce this problem here.
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=196481:

I'm affected, too.

I'm using Asus PRIME X370-PRO / 32 GB RAM (Kingston Hyperx
HX424C15FBK2/32) configured as suggested by bios 0805 (Agesa 1.0.0.6):
2400 MHz.
Problems happen with linux 4.12.x or 4.9.x (didn't test others).

Things seem to run more stably if the machine is booted twice: first boot
up to the password prompt for hd encryption, then do a hard reset.


While compiling the kernel, I see crashes like these, as well as hard lockups:

  CC [M]  drivers/video/backlight/adp8870_bl.o
  CC [M]  drivers/usb/host/r8a66597-hcd.o
../scripts/Makefile.build:315: recipe for target
'drivers/usb/host/r8a66597-hcd.o' failed
../drivers/usb/host/r8a66597-hcd.c: In function 'r8a66597_timer':
../drivers/usb/host/r8a66597-hcd.c:1824:1: internal compiler error:
Segmentation fault
 }
 ^
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.
make[5]: *** [drivers/usb/host/r8a66597-hcd.o] Error 1
../scripts/Makefile.build:568: recipe for target 'drivers/usb/host' failed
make[4]: *** [drivers/usb/host] Error 2
make[4]: *** Waiting for unfinished jobs
  CC [M]  drivers/scsi/lpfc/lpfc_mbox.o


or


  CC [M]  drivers/staging/lustre/lustre/obdclass/lustre_handles.o
  CC [M]  drivers/net/ethernet/intel/e1000e/82571.o
../scripts/Makefile.build:309: recipe for target
'drivers/net/ethernet/intel/e1000e/82571.o' failed
../drivers/net/ethernet/intel/e1000e/82571.c: In function
'e1000_init_hw_82571':
../drivers/net/ethernet/intel/e1000e/82571.c:1152:1: internal compiler
error: Segmentation fault
 }
 ^
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.
make[7]: *** [drivers/net/ethernet/intel/e1000e/82571.o] Error 1
../scripts/Makefile.build:568: recipe for target
'drivers/net/ethernet/intel/e1000e' failed
  CC [M]  drivers/scsi/fcoe/fcoe.o
make[6]: *** [drivers/net/ethernet/intel/e1000e] Error 2
make[6]: *** Waiting for unfinished jobs


It has also happened that the compile simply hangs because two processes wait
for each other.


Sometimes I get entries like these in messages:


Jul 25 17:08:03 dualc kernel: traps: cc1[17305] general protection ip:48960c 
sp:7fff9910 error:0
Jul 25 17:08:03 dualc kernel:  in cc1[40+c73000]
Jul 25 17:08:03 dualc kernel: Modules linked in: vhost_net tun vhost macvtap 
macvlan igb dca nf_log_ipv4 nf_log_common xt_LOG ipt_REJECT nf_reject_ipv4 
xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack 
iptable_filter ip_tables x_tables vfio_pci vfio_iommu_type1 vfio_virqfd vfio 
br_netfilter bridge stp llc iscsi_ibft iscsi_boot_sysfs it87(O) hwmon_vid 
snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel eeepc_wmi asus_wmi 
snd_hda_codec sparse_keymap rfkill snd_hda_core video snd_hwdep mxm_wmi snd_pcm 
snd_seq snd_seq_device kvm_amd snd_timer kvm irqbypass snd pcspkr e1000e 
sp5100_tco soundcore i2c_piix4 ptp pps_core acpi_cpufreq fjes tpm_tis 
gpio_amdpt 8250_dw gpio_generic pinctrl_amd i2c_designware_platform wmi 
tpm_tis_core shpchp i2c_designware_core button tpm nfsd auth_rpcgss nfs_acl 
lockd grace
Jul 25 17:08:03 dualc kernel:  sunrpc xfs libcrc32c dm_crypt hid_generic usbhid 
raid1 md_mod amdkfd amd_iommu_v2 radeon crct10dif_pclmul crc32_pclmul 
crc32c_intel i2c_algo_bit ghash_clmulni_intel drm_kms_helper syscopyarea 
sysfillrect sysimgblt fb_sys_fops ttm serio_raw drm ccp sr_mod cdrom xhci_pci 
xhci_hcd usbcore aesni_intel aes_x86_64 glue_helper lrw ablk_helper cryptd 
ata_generic pata_atiixp dm_mirror dm_region_hash dm_log sg thermal dm_multipath 
dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua
Jul 25 17:08:03 dualc kernel: CPU: 9 PID: 17378 Comm: sh Tainted: G   O 
   4.9.39-2.4-default #1
Jul 25 17:08:03 dualc kernel: Hardware name: System manufacturer System Product 
Name/PRIME X370-PRO, BIOS 0805 06/20/2017
Jul 25 17:08:03 dualc kernel: task: 968a69f72140 task.stack: 
b69e9468
Jul 25 17:08:03 dualc kernel: RIP: 0010:[]  
[] lock_page_memcg+0x4f/0x80
Jul 25 17:08:03 dualc kernel: RSP: 0018:b69e94683c30  EFLAGS: 00010286
Jul 25 17:08:03 dualc kernel: RAX: 968a69f72140 RBX: cfff96844ca83000 RCX: 
007f8a32
Jul 25 17:08:03 dualc kernel: RDX: e459dfe28c80 RSI:  RDI: 
e459dfe28c80
Jul 25 17:08:03 dualc kernel: RBP: b69e94683c48 R08: 96880c194480 R09: 
77988000
Jul 25 17:08:03 dualc kernel: R10: 968abe85ccd8 R11: 00020002 R12: 
7795d000
Jul 25 17:08:03 dualc kernel: R13: e459dfe28c80 R14: b69e94683dd0 R15: 
7795e000
Jul 25 17:08:03 dualc kernel: FS: 

Regression - Linux 4.9: ums_eneub6250 broken: transfer buffer not dma capable - Trace

2017-04-15 Thread Andreas Hartmann
Hello!

Since Linux 4.9, ums_eneub6250 is broken. It works fine if
CONFIG_VMAP_STACK is disabled.

I would be glad if it were fixed.


Thanks,
kind regards,
Andreas


Apr 15 17:58:54 notebook2 kernel: usb 1-1.1: new high-speed USB device number 3 
using ehci-pci
Apr 15 17:58:54 notebook2 kernel: usb 1-1.1: New USB device found, 
idVendor=0cf2, idProduct=6250
Apr 15 17:58:54 notebook2 kernel: usb 1-1.1: New USB device strings: Mfr=1, 
Product=2, SerialNumber=4
Apr 15 17:58:54 notebook2 kernel: usb 1-1.1: Product: UB6250   
Apr 15 17:58:54 notebook2 kernel: usb 1-1.1: Manufacturer: ENE Flash  
Apr 15 17:58:54 notebook2 kernel: usb 1-1.1: SerialNumber: 606569746801
Apr 15 17:58:54 notebook2 mtp-probe[2134]: checking bus 1, device 3: 
"/sys/devices/pci:00/:00:1a.0/usb1/1-1/1-1.1"
Apr 15 17:58:54 notebook2 mtp-probe[2134]: bus: 1, device: 3 was not an MTP 
device
Apr 15 17:58:55 notebook2 kernel: usbcore: registered new interface driver 
usb-storage
Apr 15 17:58:55 notebook2 kernel: usbcore: registered new interface driver uas
Apr 15 17:58:55 notebook2 kernel: ums_eneub6250 1-1.1:1.0: USB Mass Storage 
device detected
Apr 15 17:58:55 notebook2 kernel: scsi host6: usb-storage 1-1.1:1.0
Apr 15 17:58:55 notebook2 kernel: [ cut here ]
Apr 15 17:58:55 notebook2 kernel: WARNING: CPU: 2 PID: 2133 at 
../drivers/usb/core/hcd.c:1587 usb_hcd_map_urb_for_dma+0x4ba/0x4f0 [usbcore]
Apr 15 17:58:55 notebook2 kernel: transfer buffer not dma capable
Apr 15 17:58:55 notebook2 kernel: Modules linked in: ums_eneub6250(+) uas 
usb_storage fuse binfmt_misc snd_hda_codec_hdmi snd_hda_codec_realtek 
snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep 
snd_pcm_oss msi_wmi iTCO_wdt iTCO_vendor_support snd_pcm wmi snd_seq battery ac 
msi_laptop sparse_keymap rfkill joydev snd_seq_device snd_timer r8169 mii 
snd_mixer_oss intel_powerclamp coretemp kvm_intel snd mei_me mei kvm i2c_i801 
lpc_ich soundcore intel_ips shpchp mfd_core i2c_smbus fjes acpi_cpufreq tpm_tis 
pcspkr thermal tpm_tis_core tpm irqbypass fan dm_crypt crc32c_intel serio_raw 
sr_mod cdrom ehci_pci i915 ehci_hcd i2c_algo_bit usbcore drm_kms_helper 
syscopyarea sysfillrect sysimgblt fb_sys_fops drm video button dm_mirror 
dm_region_hash dm_log sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc 
scsi_dh_alua
Apr 15 17:58:55 notebook2 kernel: CPU: 2 PID: 2133 Comm: systemd-udevd Not 
tainted 4.9.21-1-default #1
Apr 15 17:58:55 notebook2 kernel: Hardware name: Micro-Star International 
CR620/CR620, BIOS E1681IMS VER.10C 04/12/2011
Apr 15 17:58:55 notebook2 kernel:  baf681b477f0 af3c854a 
baf681b47840 
Apr 15 17:58:55 notebook2 kernel:  baf681b47830 af085c71 
0633af0bd0de 8d35b2844e40
Apr 15 17:58:55 notebook2 kernel:   0200 
0002 8d360fafd800
Apr 15 17:58:55 notebook2 kernel: Call Trace:
Apr 15 17:58:55 notebook2 kernel:  [] dump_stack+0x63/0x89
Apr 15 17:58:55 notebook2 kernel:  [] __warn+0xd1/0xf0
Apr 15 17:58:55 notebook2 kernel:  [] 
warn_slowpath_fmt+0x4f/0x60
Apr 15 17:58:55 notebook2 kernel:  [] ? 
put_prev_entity+0x48/0x720
Apr 15 17:58:55 notebook2 kernel:  [] 
usb_hcd_map_urb_for_dma+0x4ba/0x4f0 [usbcore]
Apr 15 17:58:55 notebook2 kernel:  [] ? 
finish_task_switch+0x78/0x1e0
Apr 15 17:58:55 notebook2 kernel:  [] 
usb_hcd_submit_urb+0x1c9/0xb30 [usbcore]
Apr 15 17:58:55 notebook2 kernel:  [] ? schedule+0x3d/0x90
Apr 15 17:58:55 notebook2 kernel:  [] ? 
schedule_timeout+0x220/0x3c0
Apr 15 17:58:55 notebook2 kernel:  [] 
usb_submit_urb.part.6+0x295/0x550 [usbcore]
Apr 15 17:58:55 notebook2 kernel:  [] 
usb_submit_urb+0x34/0x70 [usbcore]
Apr 15 17:58:55 notebook2 kernel:  [] 
usb_stor_msg_common+0x9d/0x120 [usb_storage]
Apr 15 17:58:55 notebook2 kernel:  [] 
usb_stor_bulk_transfer_buf+0x56/0xa0 [usb_storage]
Apr 15 17:58:55 notebook2 kernel:  [] 
usb_stor_bulk_transfer_sg+0x4e/0x60 [usb_storage]
Apr 15 17:58:55 notebook2 kernel:  [] 
ene_send_scsi_cmd+0x97/0x160 [ums_eneub6250]
Apr 15 17:58:55 notebook2 kernel:  [] 
ene_get_card_type.constprop.19+0x5b/0x60 [ums_eneub6250]
Apr 15 17:58:55 notebook2 kernel:  [] 
ene_ub6250_probe+0x8f/0x110 [ums_eneub6250]
Apr 15 17:58:55 notebook2 kernel:  [] 
usb_probe_interface+0x157/0x2f0 [usbcore]
Apr 15 17:58:55 notebook2 kernel:  [] 
driver_probe_device+0x227/0x440
Apr 15 17:58:55 notebook2 kernel:  [] 
__driver_attach+0xdd/0xe0
Apr 15 17:58:55 notebook2 kernel:  [] ? 
driver_probe_device+0x440/0x440
Apr 15 17:58:55 notebook2 kernel:  [] 
bus_for_each_dev+0x5d/0x90
Apr 15 17:58:55 notebook2 kernel:  [] driver_attach+0x1e/0x20
Apr 15 17:58:55 notebook2 kernel:  [] 
bus_add_driver+0x45/0x270
Apr 15 17:58:55 notebook2 kernel:  [] 
driver_register+0x60/0xe0
Apr 15 17:58:55 notebook2 kernel:  [] 
usb_register_driver+0x82/0x150 [usbcore]
Apr 15 17:58:55 notebook2 kernel:  [] ? 0xc03b9000
Apr 15 17:58:55 notebook2 kernel:  [] 
ene_ub6250_driver_init+0x38/0x1000 [ums_eneub6250]

ata3.00: failed command: WRITE FPDMA QUEUED since Linux 4.1

2015-07-24 Thread Andreas Hartmann
Hello!

Since Linux 4.1, there are often ATA errors like these:

[1.154572] libata version 3.00 loaded.
[1.787436] ahci :00:11.0: AHCI 0001.0200 32 slots 6 ports 6 Gbps
0x3f impl SATA mode
[1.788731] ata1: SATA max UDMA/133 abar m1024@0xfdfff000 port
0xfdfff100 irq 19
[1.788733] ata2: SATA max UDMA/133 abar m1024@0xfdfff000 port
0xfdfff180 irq 19
[1.788734] ata3: SATA max UDMA/133 abar m1024@0xfdfff000 port
0xfdfff200 irq 19
[1.788736] ata4: SATA max UDMA/133 abar m1024@0xfdfff000 port
0xfdfff280 irq 19
[1.788738] ata5: SATA max UDMA/133 abar m1024@0xfdfff000 port
0xfdfff300 irq 19
[1.788740] ata6: SATA max UDMA/133 abar m1024@0xfdfff000 port
0xfdfff380 irq 19
[2.105906] ata4: SATA link down (SStatus 0 SControl 300)
[2.109960] ata6: SATA link down (SStatus 0 SControl 300)
[2.281699] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[2.281717] ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[2.281737] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[2.281752] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[2.282446] ata2.00: ATA-8: ST3000DM001-1CH166, CC24, max UDMA/133
[2.282448] ata2.00: 5860533168 sectors, multi 0: LBA48 NCQ (depth
31/32), AA
[2.282893] ata3.00: ATA-9: ST3000DM001-1CH166, CC29, max UDMA/133
[2.282895] ata3.00: 5860533168 sectors, multi 0: LBA48 NCQ (depth
31/32), AA
[2.283287] ata2.00: configured for UDMA/133
[2.283763] ata3.00: configured for UDMA/133
[2.289800] ata5.00: ATAPI: HL-DT-ST BD-RE  BH10LS38, 1.00, max UDMA/133
[2.293383] ata1.00: ATA-8: Corsair Force GT, 1.3.3, max UDMA/133
[2.293385] ata1.00: 468862128 sectors, multi 16: LBA48 NCQ (depth
31/32), AA
[2.293754] ata5.00: configured for UDMA/133
[2.303356] ata1.00: configured for UDMA/133
[2.303469] scsi 0:0:0:0: Direct-Access ATA  Corsair Force GT
3PQ: 0 ANSI: 5
[2.303760] scsi 1:0:0:0: Direct-Access ATA  ST3000DM001-1CH1
CC24 PQ: 0 ANSI: 5
[2.304055] scsi 2:0:0:0: Direct-Access ATA  ST3000DM001-1CH1
CC29 PQ: 0 ANSI: 5
[   48.689195] ata3.00: exception Emask 0x0 SAct 0x18 SErr 0x0 action
0x6 frozen
[   48.690421] ata3.00: failed command: WRITE FPDMA QUEUED
[   48.691597] ata3.00: cmd 61/58:18:21:ab:eb/05:00:58:00:00/40 tag 3
ncq 700416 out
[   48.693977] ata3.00: status: { DRDY }
[   48.695115] ata3.00: failed command: WRITE FPDMA QUEUED
[   48.696257] ata3.00: cmd 61/00:20:21:a3:eb/08:00:58:00:00/40 tag 4
ncq 1048576 out
[   48.698702] ata3.00: status: { DRDY }
[   48.699856] ata3: hard resetting link
[   49.188612] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   49.385330] ata3.00: configured for UDMA/133
[   49.385356] ata3.00: device reported invalid CHS sector 0
[   49.385380] ata3.00: device reported invalid CHS sector 0
[   49.385393] ata3: EH complete
[   79.630109] ata3.00: exception Emask 0x0 SAct 0xc0 SErr 0x0 action
0x6 frozen
[   79.631069] ata3.00: failed command: WRITE FPDMA QUEUED
[   79.632057] ata3.00: cmd 61/00:30:21:a3:eb/08:00:58:00:00/40 tag 6
ncq 1048576 out
[   79.634185] ata3.00: status: { DRDY }
[   79.635267] ata3.00: failed command: WRITE FPDMA QUEUED
[   79.636378] ata3.00: cmd 61/58:38:21:ab:eb/05:00:58:00:00/40 tag 7
ncq 700416 out
[   79.638743] ata3.00: status: { DRDY }
[   79.639935] ata3: hard resetting link
[   80.129527] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   80.145631] ata3.00: configured for UDMA/133
[   80.145661] ata3.00: device reported invalid CHS sector 0
[   80.145680] ata3.00: device reported invalid CHS sector 0
[   80.145693] ata3: EH complete
[  110.571021] ata3.00: exception Emask 0x0 SAct 0x1800 SErr 0x0 action
0x6 frozen
[  110.572263] ata3.00: failed command: WRITE FPDMA QUEUED
[  110.573505] ata3.00: cmd 61/58:58:21:ab:eb/05:00:58:00:00/40 tag 11
ncq 700416 out
[  110.576028] ata3.00: status: { DRDY }
[  110.577267] ata3.00: failed command: WRITE FPDMA QUEUED
[  110.578508] ata3.00: cmd 61/00:60:21:a3:eb/08:00:58:00:00/40 tag 12
ncq 1048576 out
[  110.580954] ata3.00: status: { DRDY }
[  110.582183] ata3: hard resetting link
[  111.070441] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  111.072173] ata3.00: configured for UDMA/133
[  111.072198] ata3.00: device reported invalid CHS sector 0
[  111.07] ata3.00: device reported invalid CHS sector 0
[  111.072235] ata3: EH complete
[  141.511934] ata3.00: NCQ disabled due to excessive errors
[  141.511943] ata3.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action
0x6 frozen
[  141.512904] ata3.00: failed command: WRITE FPDMA QUEUED
[  141.513894] ata3.00: cmd 61/00:80:21:a3:eb/08:00:58:00:00/40 tag 16
ncq 1048576 out
[  141.516018] ata3.00: status: { DRDY }
[  141.517106] ata3.00: failed command: WRITE FPDMA QUEUED
[  141.518224] ata3.00: cmd 61/58:88:21:ab:eb/05:00:58:00:00/40 tag 17
ncq 700416 out
[  141.520587] ata3.00: status: { DRDY }
[  141.521791] ata3: hard resetting link
[  142.011355] ata3: SATA link 
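
The numbers in the log above can be cross-checked by hand. The following is a minimal sketch, assuming the usual libata layout of the "cmd" line (command/feature:nsect:lbal:lbam:lbah/hob_feature:...), 512-byte logical sectors, and that SAct is a bitmask of outstanding NCQ tags. The helper names are made up for illustration and are not kernel functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Collect the tag numbers whose bits are set in an SAct value.
 * Returns how many tags were written to out (at most 32). */
size_t sact_tags(uint32_t sact, int out[32])
{
    size_t n = 0;

    for (int tag = 0; tag < 32; tag++) {
        if (sact & (UINT32_C(1) << tag))
            out[n++] = tag;
    }
    return n;
}

/* Transfer size in bytes of an NCQ command, taken from the feature and
 * hob_feature bytes of the "cmd" line (assumption: 512-byte sectors). */
uint32_t ncq_bytes(uint8_t feature, uint8_t hob_feature)
{
    uint32_t sectors = ((uint32_t)hob_feature << 8) | feature;

    if (sectors == 0)
        sectors = 65536; /* a count of 0 encodes 65536 sectors for NCQ */
    return sectors * 512u;
}

/* NCQ tag as carried in bits 7:3 of the sector-count (nsect) byte. */
int ncq_tag(uint8_t nsect)
{
    return nsect >> 3;
}
```

With these helpers, SAct 0x18 yields tags 3 and 4, and "cmd 61/58:18:.../05:..." yields tag 3 with 0x0558 sectors = 700416 bytes, which matches the "tag 3 ncq 700416 out" line in the first exception.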

Since Linux 4.1: A lot of AMD-Vi IO_PAGE_FAULTs

2015-07-21 Thread Andreas Hartmann
Hello!

Since Linux 4.1, I'm getting a lot of IO_PAGE_FAULT events like this one

[   17.048609] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:11.0
domain=0x0008 address=0x40ebaaab00618000 flags=0x0010]

with different addresses:

0x40ebaaab00618000
0x40ebaaab00618040
0x
0x0180
0x00c0
0x0080
0x0100
0x0040
0x0140
0x01c0
0x0200
0x0240
0x0280

...

device=00:11.0 is:

# lspci -vvs 00:11.0
00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI]
SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (rev 40) (prog-if 01 [AHCI
1.0])
Subsystem: Gigabyte Technology Co., Ltd Device b002
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
SERR- http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: f_op->read seems to be always NULL since Linux 4.1

2015-06-27 Thread Andreas Hartmann

On Sat, Jun 27, 2015 at 8:10 PM, Richard Weinberger wrote:
> On Sat, Jun 27, 2015 at 7:32 PM, Andreas Hartmann wrote:
>> [...]
>
> See __vfs_read().
> Your module must not rely on such internals.


Thanks for pointing me to that function, which has existed since 3.19.

Is there a site out there that lists all relevant changes made in each
kernel version, with recommendations on how to handle them correctly?



Kind regards,
Andreas
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


f_op->read seems to be always NULL since Linux 4.1

2015-06-27 Thread Andreas Hartmann
Hello!

A module like the following snippet runs fine w/ Linux 4.0 and an ext4
fs, but it doesn't work w/ Linux 4.1 because f->f_op->read is no longer
set (= NULL). Is this the intended behavior now?

vfs_read(f, buf, 128, &f->f_pos) works fine.


module.c

#include 
#include 
#include 
#include 

int init_module(void)
{
struct file *f;
char buf[128];
mm_segment_t fs;
int i;
int len=128;

for(i=0;if_op->read) {
f->f_op->read(f, buf, len, &f->f_pos);
printk(KERN_INFO "buf:%s\n",buf);
}
else {
printk(KERN_INFO "No read method\n");
}

set_fs(fs);

}
filp_close(f,NULL);
return 0;
}

void cleanup_module(void)
{
printk(KERN_INFO "My module is unloaded\n");
}
---

Makefile:
---
obj-m += module.o

all:
make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean




Regards,
Andreas


Re: [PATCH 4/4] PCI: quirk Atheros AR93xx to avoid bus reset

2015-01-12 Thread Andreas Hartmann
Hello Alex!

Alex Williamson wrote:
> On Mon, 2015-01-12 at 16:20 +0100, Andreas Hartmann wrote:
>> Alex Williamson wrote:
>>> On Thu, 2015-01-08 at 09:07 -0700, Bjorn Helgaas wrote:
>>>> On Fri, Nov 21, 2014 at 11:24:27AM -0700, Alex Williamson wrote:
>>>>> Reports against the TL-WDN4800 card indicate that PCI bus reset of
>>>>> this Atheros device cause system lock-ups and resets.  I've also
>>>>> been able to confirm this behavior on multiple systems.  The device
>>>>> never returns from reset and attempts to access config space of the
>>>>> device after reset result in hangs.  Blacklist bus reset for the
>>>>> device to avoid this issue.
>>>>>
>>>>> Reported-by: Andreas Hartmann 
>>>>> Signed-off-by: Alex Williamson 
>>>>> Tested-by: Andreas Hartmann 
>>>>
>>>> If I understand correctly, these two (patches 3 & 4) fix a v3.14 regression
>>>> caused by 425c1b223dac ("PCI: Add Virtual Channel to save/restore 
>>>> support").
>>>>
>>>> If so, these should go to for-linus for v3.19.  What about patches 1 & 2?
>>>> Do they fix a regression?  Is there a pointer to a bugzilla or problem
>>>> report about that issue?
>>>>
>>>> I don't understand the connection between 425c1b223dac and
>>>> PCI_DEV_FLAGS_NO_BUS_RESET, because 425c1b223dac doesn't seem to do any
>>>> resets.  Is that the wrong commit, or can you outline the connection for
>>>> me?
>>>
>>> TBH, I don't have a lot of faith in associating this to 425c1b223dac,
>>> I'm not sure how Andreas' bisect landed there. 
>>
>> Because removing this patch made it work again :-)
>>
>> And also:
>> http://thread.gmane.org/gmane.linux.kernel.pci/35170/focus=35984
>>
>> Kernel 2.10. and 2.12. and 2.13. did work fine for me. 2.14 is the first
>> kernel, which hangs the machine at startup of the VM. The userland
>> (qemu) didn't change in between.
> 
> s/2\./3\./

Thanks :-) It seems I don't like the number 3 :-)

> Ok, so what about VC save/restore (425c1b223dac) is the problem then?
> When we tried to determine that, you found that if we continue from the
> top of the save loop, everything works (ie. no VC state saved), but if
> you continue after the variable declaration of the same loop (ie. still
> no VC state saved), it breaks:
> 
> http://www.spinics.net/lists/linux-pci/msg36166.html
> 
> So, please forgive me if I don't have a whole lot of faith that
> 425c1b223dac is involved.

It's hard for me, too. Really. It's kind of mysterious.

> We also both independently determined that this particular device never
> recovers from a PCI bus reset, even when done from userspace with setpci
> and absolutely no save/restore wrappers.

Yes.

>  Config space on the device is
> never accessible after the reset.

Yes.

>  Therefore, how could any sort of bus
> reset with save/restore ever work for this device?

I can't say. What I can definitely say is that I never had problems
running VMs w/ qemu until 3.14 came along. Do you think I'm lying? I
used 3.10 and 3.12 for a long time w/o (known!) problems (3.12 only on
the first start of the VM). Otherwise I would have been here long ago :-))).

>> Therefore: from my point of view, it is a regression, because things
>> have been working < 2.14.
>>
>> Besides that: There is undoubtedly a problem with resetting
>> this card. But the difference between >= 3.14 and < 3.14 is that < 3.14
>> worked nevertheless. The patch
>> 425c1b223dac456d00a61fd6b451b6d1cf00d065 obviously changed something,
>> though I can't say what. Therefore, the quirk patch is
>> definitely required, because things work completely fine again w/ this
>> patch.
>>
>> "Working" means for me here: I was able to start (and use) the VM w/o
>> crashing the machine and this isn't possible w/ unpatched 2.14+ any
>> more. Yes, w/ 2.12, I wasn't able to restart the VM (it then crashed the
>> machine), but w/ 2.10 even this was possible.
> 
> What?!  So v3.12 still had a machine crash when assigning this device.

Yes. If you *re*start the VM (for a long time I didn't know that at all
- I just discovered it during testing while analyzing the problem :-)).
The first start (after reboot) was not a problem. This was the usual use
case here :-)).

Believe me, I'm really convinced that this card does have a problem with
resets. I'm just wonde

Re: [PATCH 4/4] PCI: quirk Atheros AR93xx to avoid bus reset

2015-01-12 Thread Andreas Hartmann
Alex Williamson wrote:
> On Thu, 2015-01-08 at 09:07 -0700, Bjorn Helgaas wrote:
>> On Fri, Nov 21, 2014 at 11:24:27AM -0700, Alex Williamson wrote:
>>> Reports against the TL-WDN4800 card indicate that PCI bus reset of
>>> this Atheros device cause system lock-ups and resets.  I've also
>>> been able to confirm this behavior on multiple systems.  The device
>>> never returns from reset and attempts to access config space of the
>>> device after reset result in hangs.  Blacklist bus reset for the
>>> device to avoid this issue.
>>>
>>> Reported-by: Andreas Hartmann 
>>> Signed-off-by: Alex Williamson 
>>> Tested-by: Andreas Hartmann 
>>
>> If I understand correctly, these two (patches 3 & 4) fix a v3.14 regression
>> caused by 425c1b223dac ("PCI: Add Virtual Channel to save/restore support").
>>
>> If so, these should go to for-linus for v3.19.  What about patches 1 & 2?
>> Do they fix a regression?  Is there a pointer to a bugzilla or problem
>> report about that issue?
>>
>> I don't understand the connection between 425c1b223dac and
>> PCI_DEV_FLAGS_NO_BUS_RESET, because 425c1b223dac doesn't seem to do any
>> resets.  Is that the wrong commit, or can you outline the connection for
>> me?
> 
> TBH, I don't have a lot of faith in associating this to 425c1b223dac,
> I'm not sure how Andreas' bisect landed there. 

Because removing this patch made it work again :-)

And also:
http://thread.gmane.org/gmane.linux.kernel.pci/35170/focus=35984

Kernel 2.10. and 2.12. and 2.13. did work fine for me. 2.14 is the first
kernel, which hangs the machine at startup of the VM. The userland
(qemu) didn't change in between.

Therefore: from my point of view, it is a regression, because things
have been working < 2.14.

Besides that: There is undoubtedly a problem with resetting
this card. But the difference between >= 3.14 and < 3.14 is that < 3.14
worked nevertheless. The patch
425c1b223dac456d00a61fd6b451b6d1cf00d065 obviously changed something,
though I can't say what. Therefore, the quirk patch is
definitely required, because things work completely fine again w/ this
patch.

"Working" means for me here: I was able to start (and use) the VM w/o
crashing the machine and this isn't possible w/ unpatched 2.14+ any
more. Yes, w/ 2.12, I wasn't able to restart the VM (it then crashed the
machine), but w/ 2.10 even this was possible.


> IME, this device cannot,
> and has never been able to handle a bus reset.  A simple setpci
> experiment on the commandline can confirm this.  What I think happened
> is that with the PCI bus reset infrastructure we added, we switched QEMU
> to prefer PCI bus resets over things like PM D3hot->D0 resets.  So it's
> just more prolific use of bus resets by userspace.
> 
> There's also no regression in 1 & 2, PM reset has never done anything
> useful on those devices.  Thanks,
> 
> Alex
> 
>>> ---
>>>
>>>  drivers/pci/quirks.c |   14 ++
>>>  1 file changed, 14 insertions(+)
>>>
>>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>>> index 561e10d..ebbd5b4 100644
>>> --- a/drivers/pci/quirks.c
>>> +++ b/drivers/pci/quirks.c
>>> @@ -3029,6 +3029,20 @@ static void quirk_no_pm_reset(struct pci_dev *dev)
>>>  DECLARE_PCI_FIXUP_CLASS_HEADER(PCI_VENDOR_ID_ATI, PCI_ANY_ID,
>>>PCI_CLASS_DISPLAY_VGA, 8, quirk_no_pm_reset);
>>>  
>>> +static void quirk_no_bus_reset(struct pci_dev *dev)
>>> +{
>>> +   dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET;
>>> +}
>>> +
>>> +/*
>>> + * Atheros AR93xx chips do not behave after a bus reset.  The device will
>>> + * throw a Link Down error on AER capable system and regardless of AER,
>>> + * config space of the device is never accessible again and typically
>>> + * causes the system to hang or reset when access is attempted.
>>> + * http://www.spinics.net/lists/linux-pci/msg34797.html
>>> + */
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0030, 
>>> quirk_no_bus_reset);
>>> +
>>>  #ifdef CONFIG_ACPI
>>>  /*
>>>   * Apple: Shutdown Cactus Ridge Thunderbolt controller.
>>>


Re: [PATCH] Revert "cfg80211: make WEXT compatibility unselectable"

2015-01-01 Thread Andreas Hartmann
Arend van Spriel wrote:
> On 12/31/14 16:14, Andreas Hartmann wrote:
[...]
>> All in all:
>> If you want to get rid of wext, you still have to go a *very* long way
>> to get the same *stable* and high throughput quality with *all* chips
>> depending on mac80211 and not just a few flagship drivers like Atheros.
> 
> Hi Andreas,
> 
> That's a nice list of unrelated stuff. This has all nothing to do with
> WEXT. Actually, you can build rt5572sta with cfg80211 support
> (RT_CFG80211_SUPPORT).

You seem to know sources I don't know of. Could you please tell me
where to find them?

I have DPO_RT5572_LinuxSTA_2.6.0.1_20120629 which doesn't compile with
HAS_CFG80211_SUPPORT=y because -DCONFIG_AP_SUPPORT, on which
RT_CFG80211_SUPPORT relies, is broken.

DPO_RT5572_LinuxSTA_2.6.1.3_20121022 removed the necessary broken AP
code completely.

> This thread is about the configuration API and
> not about driver performance.

I know.

I tried to show why WEXT as a whole is still necessary even if there is
a mac80211-based driver: because of the weakness of rt2800usb.
Nip it in the bud.



Kind regards,
Andreas


Re: [PATCH] Revert "cfg80211: make WEXT compatibility unselectable"

2014-12-31 Thread Andreas Hartmann
Jiri Kosina wrote:
> On Wed, 31 Dec 2014, Arend van Spriel wrote:
> 
>> The thing with WEXT is that it will stay as is. So if tools like wicd 
>> want to support new features like P2P it will need to make the switch. I 
>> checked out wicd repo and found a number of iwconfig calls and they kick 
>> off wpa_supplicant with wext driver.
> 
> Unfortunately this is by no means just about wicd. I have already received 
> a few off-list mails from people who were wondering why their home-made 
> scripts / tools, which are running 'iwconfig' directly suddenly stopped to 
> work, and that it was indeed fallout of WEXT going away. Given the very 
> short time this has been in mainline, you can probably imagine the 
> fireworks once this appears in major release.

It is not just the userspace tools (I prefer them, too) that need
wext, but a lot of drivers as well, such as the MediaTek drivers, which
perform *much* better than rt2x00, especially for USB
chips like the one used by the Linksys AE3000 (3x3 MIMO)
(https://wikidevi.com/wiki/Linksys_AE3000), which achieves *average*
throughput of around 14 MB/s with scp of big (> 10 GB) encrypted
files even through a reinforced-concrete floor(!) - rt2x00 is *far*
from providing such performance.

Another weak point of rt2x00 is its huge CPU overhead - compare
rt5572sta on a Raspi with rt2x00 running netperf and you will see the huge
problem of rt2x00 (which is masked on x86 by mostly oversized multi-core
CPUs).

Another big advantage of rt5572sta is: it is *stable* over a lot of
kernel versions (as long as the kernel didn't break interfaces - but
there are patches to catch them).

Even ath9k, which usually is a really fine driver, is broken on some
kernel versions (link and throughput are not stable - my use case depends
*heavily* on very high and long-term stable throughput). That's why I'm
using a VM for my ath9k device, to be independent of these quality
problems of mac80211 (or maybe ath9k - don't know) across different kernel
versions.


All in all:
If you want to get rid of wext, you still have to go a *very* long way
to get the same *stable* and high throughput quality with *all* chips
depending on mac80211 and not just a few flagship drivers like Atheros.



Kind regards,
Andreas Hartmann


Re: [PATCH 4/4] PCI: quirk Atheros AR93xx to avoid bus reset

2014-12-26 Thread Andreas Hartmann
Hello Bjorn,

I have been running this patch and the corresponding "[PATCH 3/4] PCI: Allow
device quirks to exclude bus reset" patch for a month now w/
kernel 3.14.x and haven't found any problems. Would it be possible to
apply these patches to the mainline kernel? Or even to the longterm kernel 3.14?


Thanks.
kind regards,
Andreas Hartmann


Alex Williamson wrote:
> Reports against the TL-WDN4800 card indicate that PCI bus reset of
> this Atheros device cause system lock-ups and resets.  I've also
> been able to confirm this behavior on multiple systems.  The device
> never returns from reset and attempts to access config space of the
> device after reset result in hangs.  Blacklist bus reset for the
> device to avoid this issue.
> 
> Reported-by: Andreas Hartmann 
> Signed-off-by: Alex Williamson 
> Tested-by: Andreas Hartmann 
> ---
> 
>  drivers/pci/quirks.c |   14 ++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 561e10d..ebbd5b4 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3029,6 +3029,20 @@ static void quirk_no_pm_reset(struct pci_dev *dev)
>  DECLARE_PCI_FIXUP_CLASS_HEADER(PCI_VENDOR_ID_ATI, PCI_ANY_ID,
>  PCI_CLASS_DISPLAY_VGA, 8, quirk_no_pm_reset);
>  
> +static void quirk_no_bus_reset(struct pci_dev *dev)
> +{
> + dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET;
> +}
> +
> +/*
> + * Atheros AR93xx chips do not behave after a bus reset.  The device will
> + * throw a Link Down error on AER capable system and regardless of AER,
> + * config space of the device is never accessible again and typically
> + * causes the system to hang or reset when access is attempted.
> + * http://www.spinics.net/lists/linux-pci/msg34797.html
> + */
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0030, quirk_no_bus_reset);
> +
>  #ifdef CONFIG_ACPI
>  /*
>   * Apple: Shutdown Cactus Ridge Thunderbolt controller.
> 
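
How a header fixup like the one in the quoted patch ends up preventing the reset can be sketched with a small user-space model. Everything prefixed model_ or MODEL_ is hypothetical and only imitates the kernel mechanism: a table entry matches vendor and device ID, its hook sets a flag on the device, and the reset path refuses to run when the flag is set. The vendor ID 0x168c for Atheros and the -ENOTTY return value are assumptions not taken from the patch text above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_VENDOR_ID_ATHEROS 0x168c     /* assumed Atheros vendor ID */
#define MODEL_FLAG_NO_BUS_RESET (1u << 0)  /* stand-in for the dev_flags bit */

struct model_pci_dev {
    uint16_t vendor;
    uint16_t device;
    unsigned int dev_flags;
};

struct model_fixup {
    uint16_t vendor;
    uint16_t device;
    void (*hook)(struct model_pci_dev *dev);
};

/* Same idea as quirk_no_bus_reset() in the patch: just mark the device. */
static void model_quirk_no_bus_reset(struct model_pci_dev *dev)
{
    dev->dev_flags |= MODEL_FLAG_NO_BUS_RESET;
}

/* One entry, mirroring the fixup declared for device 0x0030. */
static const struct model_fixup model_fixups[] = {
    { MODEL_VENDOR_ID_ATHEROS, 0x0030, model_quirk_no_bus_reset },
};

/* Run every fixup whose vendor/device pair matches the device. */
void model_apply_fixups(struct model_pci_dev *dev)
{
    size_t n = sizeof(model_fixups) / sizeof(model_fixups[0]);

    for (size_t i = 0; i < n; i++) {
        if (model_fixups[i].vendor == dev->vendor &&
            model_fixups[i].device == dev->device)
            model_fixups[i].hook(dev);
    }
}

/* Reset path: refuse when the quirk flag is set (error code assumed). */
int model_bus_reset(const struct model_pci_dev *dev)
{
    if (dev->dev_flags & MODEL_FLAG_NO_BUS_RESET)
        return -ENOTTY;
    return 0; /* a real implementation would toggle the bridge here */
}
```

In this model a device with ID 0x0030 is never bus-reset, so a caller has to fall back to another reset method or skip the reset, while any other device still takes the normal path.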



Re: Strange problem with vxlan!

2014-01-08 Thread Andreas Hartmann
Hi!

For everyone else having problems w/ broken multicast:

See the solution here:
http://article.gmane.org/gmane.linux.kernel/1625590


Regards,
Andreas


Out of the box (ootb) multicast broken since Linux 3.5 until min. 3.12.

2014-01-08 Thread Andreas Hartmann
Hello!

This patch:

commit c5c23260594c5701af66ef754916775ba6a46bbc
Author: Herbert Xu 
Date:   Fri Apr 13 02:37:42 2012 +

bridge: Add multicast_querier toggle and disable queries by default

Sending general queries was implemented as an optimisation to speed
up convergence on start-up.  In order to prevent interference with
multicast routers a zero source address has to be used.

Unfortunately these packets appear to cause some multicast-aware
switches to misbehave, e.g., by disrupting multicast packets to us.

Since the multicast snooping feature still functions without sending
our own queries, this patch will change the default to not send
queries.

For those that need queries in order to speed up convergence on
start-up, a toggle is provided to restore the previous behaviour.

Signed-off-by: Herbert Xu 
Signed-off-by: David S. Miller 


broke ootb multicast in Linux up to at least 3.12 (later versions not
tested) for this use case:

http://thread.gmane.org/gmane.linux.kernel/1622550


It is necessary to set this switch manually

echo "1" > /sys/devices/virtual/net/br0/bridge/multicast_querier

to get multicast working again.


It would be nice to get the old behaviour (= working multicast ootb) back.
That would have saved a lot of time, probably not only in my case
(e.g. see https://bugzilla.redhat.com/show_bug.cgi?id=880035).


Regards,
Andreas


Re: Strange problem with vxlan!

2014-01-05 Thread Andreas Hartmann
On Fri, 3 Jan 2014 15:27:19 +0100
Andreas Hartmann  wrote:

[...]

> Now the problem:
> 
> If the VM (=AP) runs e.g. Linux 3.4.x, everything works as expected.
> If the VM runs 3.12.x or even 3.10.x, the tunnel works for a few minutes
> after creation and then breaks.
> 
> Broken means:
> A "dhcpcd eth0", e.g. on the notebook, times out. Traces show:
> The UDP tunnel packets sent by the STA through vxlan0 can be seen on the
> host / tap0, but not on vxlan0 (when the tunnel works, they can be seen
> on the vxlan0 device, too).
> 
> The host runs Linux 3.10.x, the STA 3.11.6.

Some more findings:

- Problem can be seen with Linux 3.7 in the AP (VM), too.
- *Problem disappears* if the bridge device br0 on the host is set to
  promiscuous mode.
- Sometimes the warning
  "notebook dhcpcd[2784]: eth0: bad UDP checksum, ignoring"
  appears when starting dhcpcd on the notebook with br0 on the host set
  to promiscuous mode (dhcpcd worked fine nevertheless). I never saw this
  warning before.


Any idea how to fix the problem w/o running the bridge br0 on the host
in promiscuous mode?



Thanks for any hint,
Andreas


Strange problem with vxlan!

2014-01-03 Thread Andreas Hartmann
Given the following network architecture, which connects a virtual bridge br0
and a remote Ethernet switch through a vxlan tunnel via WLAN:



host[br0: tap0,vxlan0]
|||
|===
| ||
| ||
VM (WLAN access point)  [br0: eth0, wlan0]||
  |   ||
  |   ||
  -   ||
  |   ||
STA [wlan0, br0: eth0, vxlan0]
  |
  |
   |--|
Switch
   |
   --
|
notebook [eth0]



The configuration of the vxlan is:

host: route add -net 224.0.0.0 netmask 240.0.0.0 dev br0
  ip li add vxlan0 type vxlan id 1 group 239.1.1.1 dev br0

STA:  route add -net 224.0.0.0 netmask 240.0.0.0 dev wlan0
  ip li add vxlan0 type vxlan id 1 group 239.1.1.1 dev wlan0

This means: the endpoints of the vxlan tunnel are br0 (host) and wlan0 (STA).
Between them sits the WLAN AP (a VM running on the host).


Now the problem:

If the VM (=AP) runs e.g. Linux 3.4.x, everything works as expected.
If the VM runs 3.12.x or even 3.10.x, the tunnel works for a few minutes after
creation and then breaks.

Broken means:
A "dhcpcd eth0", e.g. on the notebook, times out. Traces show:
The UDP tunnel packets sent by the STA through vxlan0 can be seen on the host
/ tap0, but not on vxlan0 (when the tunnel works, they can be seen on the
vxlan0 device, too).

The host runs Linux 3.10.x, the STA 3.11.6.


Any idea why vxlan is broken w/ Linux 3.12.x or 3.10.x on the VM (AP)?



Thanks in advance for any hint,
regards,
Andreas


Re: [ 102/127] iommu/amd: Workaround for ERBT1312

2013-06-29 Thread Andreas Hartmann

Joerg Roedel wrote:
> On Sat, Jun 29, 2013 at 07:54:20AM +0200, Andreas Hartmann wrote:
>> Sorry, but it doesn't work for me at all :-(. Behaviour is unchanged. It
>> is exactly as described in the other mail: at the moment of binding vfio
>> to 14.0, the fire begins.
> 
> Hmm, VFIO attaches the device to a new domain. That clears the bit, how
> about this patch:

That didn't help either :-(


Regards,
Andreas


Re: [ 102/127] iommu/amd: Workaround for ERBT1312

2013-06-28 Thread Andreas Hartmann
Joerg Roedel wrote:
> Alex, Andreas,
> 
> On Fri, Jun 28, 2013 at 08:42:05PM +0200, Andreas Hartmann wrote:
>> You're right, there is exactly one entry directly after loading of vfio.
>> I can see this message, too, with linux 3.4.43.
> 
> Can you please test this patch? It should reduce the noise
> significantly, but a few of those error messages are still expected.

Sorry, but it doesn't work for me at all :-(. Behaviour is unchanged. It
is exactly as described in the other mail: at the moment of binding vfio
to 14.0, the fire begins.

echo "1002 4385" > /sys/bus/pci/drivers/vfio-pci/new_id
echo :00:14.0 > /sys/bus/pci/devices/:00:14.0/driver/unbind
echo :00:14.0 > /sys/bus/pci/drivers/vfio-pci/bind


Regards,
Andreas


Re: [ 102/127] iommu/amd: Workaround for ERBT1312

2013-06-28 Thread Andreas Hartmann
Alex Williamson wrote:
> On Fri, 2013-06-28 at 18:11 +0200, Andreas Hartmann wrote:
>> Hello Joerg, hello Alex,
>>
>> the subsequent patch and the patch "iommu/amd: Re-enable IOMMU event log
>> interrupt after handling." 925fe08bce38d1ff052fe2209b9e2b8d5fbb7f98
>> flood /var/log/messages with the following line (> 700 lines/second)
>> right after loading vfio:
>>
>> AMD-Vi: Event logged [IO_PAGE_FAULT device=00:14.0 domain=0x 
>> address=0x00fdf9103300 flags=0x0600]
> 
> That's interesting, I PXE boot my system from one NIC then use a
> different NIC for the iSCSI root.  The PXE boot NIC now screams like
> this, _until_ I attach it to vfio, then it quiets down.

Hmm, I just remembered an active workaround I implemented to "resolve"
an error like this, which has appeared when starting my VM to pass through
my Intel PCI ethernet device ever since I installed a new kvm version:


qemu-kvm: -device vfio-pci,host=06:06.0: vfio: failed to set iommu for
container: Device or resource busy

qemu-kvm: -device vfio-pci,host=06:06.0: vfio: failed to setup container
for group 12

qemu-kvm: -device vfio-pci,host=06:06.0: vfio: failed to get group 12

qemu-kvm: -device vfio-pci,host=06:06.0: Device 'vfio-pci' could not be
initialized


The workaround was to bind the individual multifunction devices to vfio
once during boot, release them again after 2 seconds and rebind
them to the drivers they were bound to before (if they were bound
to any).

I did this with a script beginning like this:

#!/bin/sh
modprobe vfio-pci

echo "1002 4385" > /sys/bus/pci/drivers/vfio-pci/new_id
echo :00:14.0 > /sys/bus/pci/devices/:00:14.0/driver/unbind
echo :00:14.0 > /sys/bus/pci/drivers/vfio-pci/bind
...

sleep 2

echo :00:14.0 > /sys/bus/pci/drivers/vfio-pci/unbind
echo "1002 4385" > /sys/bus/pci/drivers/vfio-pci/remove_id
...

The logs in messages:

Jun 28 15:54:12 . kernel: [   48.860147] VFIO - User Level meta-driver version: 
0.3
Jun 28 15:54:12 . kernel: [   48.875243] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=00:14.0 domain=0x address=0x00fdf9103300 flags=0x0600]
...

Therefore, the log output most probably started after device 14.0 was
bound to vfio. If it had started after removing vfio, I would
have expected 2 seconds between the start messages of vfio and the first
occurrence of the IO_PAGE_FAULT.

Today, I'm using kvm 1.3.1 and it isn't necessary to use the complete
workaround anymore. It is enough to bind / unbind the pci bridge
as described above before starting the VM with the passed through pci
ethernet device.
Because I now don't touch the 14.0 device any more, the IO_PAGE_FAULT
messages disappeared completely.

@Joerg:
Anyway, I'm going to test your provided patch tomorrow!

BTW: what does IO_PAGE_FAULT mean, and what should I expect if I
see this message?



Thanks,
regards,
Andreas


Re: [ 102/127] iommu/amd: Workaround for ERBT1312

2013-06-28 Thread Andreas Hartmann
Hello Joerg,

Joerg Roedel wrote:
> Hi Andreas,
> 
> On Fri, Jun 28, 2013 at 06:11:36PM +0200, Andreas Hartmann wrote:
>> Hello Joerg, hello Alex,
>>
>> the subsequent patch and the patch "iommu/amd: Re-enable IOMMU event log
>> interrupt after handling." 925fe08bce38d1ff052fe2209b9e2b8d5fbb7f98
>> flood /var/log/messages with the following line (> 700 lines/second)
>> right after loading vfio:
>>
>> AMD-Vi: Event logged [IO_PAGE_FAULT device=00:14.0 domain=0x 
>> address=0x00fdf9103300 flags=0x0600]
>>
>> lspci -vvvs 0:14.0
>> 00:14.0 SMBus: Advanced Micro Devices [AMD] nee ATI SBx00 SMBus Controller 
>> (rev 42)
>> Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- 
>> Stepping- SERR- FastB2B- DisINTx+
>> Status: Cap- 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- 
>> SERR-  
> Most likely a BIOS issue that is uncovered by re-enabling the event-log
> interrupt patch. The device itself is only used by the BIOS and not by
> the Linux kernel

Thanks for this info! Good to know.

[...]

>> I removed the two mentioned patches and all is working
>> fine again as before.
> 
> Without these two patches, can you check dmesg after boot if there are
> other lines which report IO_PAGE_FAULTs?

You're right, there is exactly one entry directly after loading of vfio.
I can see this message, too, with linux 3.4.43.


Regards,
Andreas


Re: [ 102/127] iommu/amd: Workaround for ERBT1312

2013-06-28 Thread Andreas Hartmann
Hello Joerg, hello Alex,

the subsequent patch and the patch "iommu/amd: Re-enable IOMMU event log
interrupt after handling." 925fe08bce38d1ff052fe2209b9e2b8d5fbb7f98
flood /var/log/messages with the following line (> 700 lines/second)
right after loading vfio:

AMD-Vi: Event logged [IO_PAGE_FAULT device=00:14.0 domain=0x 
address=0x00fdf9103300 flags=0x0600]

lspci -vvvs 0:14.0
00:14.0 SMBus: Advanced Micro Devices [AMD] nee ATI SBx00 SMBus Controller (rev 
42)
Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx+
Status: Cap- 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- 
SERR- [...]

[...] the SSD was fast enough to
cover it silently). I first noticed it when I rebooted and X didn't start
any more because the /var partition was completely full.

I removed the two mentioned patches and all is working
fine again as before.

Any idea?


Thanks,
kind regards,
Andreas


Greg Kroah-Hartman wrote:
> 3.9-stable review patch.  If anyone has any objections, please let me know.
> 
> --
> 
> From: Joerg Roedel 
> 
> commit d3263bc29706e42f74d8800807c2dedf320d77f1 upstream.
> 
> Work around an IOMMU  hardware bug where clearing the
> EVT_INT or PPR_INT bit in the status register may race with
> the hardware trying to set it again. When not handled the
> bit might not be cleared and we lose all future event or ppr
> interrupts.
> 
> Reported-by: Suravee Suthikulpanit 
> Signed-off-by: Joerg Roedel 
> Signed-off-by: Greg Kroah-Hartman 
> 
> ---
>  drivers/iommu/amd_iommu.c |   34 ++
>  1 file changed, 26 insertions(+), 8 deletions(-)
> 
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -700,14 +700,23 @@ retry:
>  
>  static void iommu_poll_events(struct amd_iommu *iommu)
>  {
> - u32 head, tail;
> + u32 head, tail, status;
>   unsigned long flags;
>  
> - /* enable event interrupts again */
> - writel(MMIO_STATUS_EVT_INT_MASK, iommu->mmio_base + MMIO_STATUS_OFFSET);
> -
>   spin_lock_irqsave(&iommu->lock, flags);
>  
> + /* enable event interrupts again */
> + do {
> + /*
> +  * Workaround for Erratum ERBT1312
> +  * Clearing the EVT_INT bit may race in the hardware, so read
> +  * it again and make sure it was really cleared
> +  */
> + status = readl(iommu->mmio_base + MMIO_STATUS_OFFSET);
> + writel(MMIO_STATUS_EVT_INT_MASK,
> +iommu->mmio_base + MMIO_STATUS_OFFSET);
> + } while (status & MMIO_STATUS_EVT_INT_MASK);
> +
>   head = readl(iommu->mmio_base + MMIO_EVT_HEAD_OFFSET);
>   tail = readl(iommu->mmio_base + MMIO_EVT_TAIL_OFFSET);
>  
> @@ -744,16 +753,25 @@ static void iommu_handle_ppr_entry(struc
>  static void iommu_poll_ppr_log(struct amd_iommu *iommu)
>  {
>   unsigned long flags;
> - u32 head, tail;
> + u32 head, tail, status;
>  
>   if (iommu->ppr_log == NULL)
>   return;
>  
> - /* enable ppr interrupts again */
> - writel(MMIO_STATUS_PPR_INT_MASK, iommu->mmio_base + MMIO_STATUS_OFFSET);
> -
>   spin_lock_irqsave(&iommu->lock, flags);
>  
> + /* enable ppr interrupts again */
> + do {
> + /*
> +  * Workaround for Erratum ERBT1312
> +  * Clearing the PPR_INT bit may race in the hardware, so read
> +  * it again and make sure it was really cleared
> +  */
> + status = readl(iommu->mmio_base + MMIO_STATUS_OFFSET);
> + writel(MMIO_STATUS_PPR_INT_MASK,
> +iommu->mmio_base + MMIO_STATUS_OFFSET);
> + } while (status & MMIO_STATUS_PPR_INT_MASK);
> +
>   head = readl(iommu->mmio_base + MMIO_PPR_HEAD_OFFSET);
>   tail = readl(iommu->mmio_base + MMIO_PPR_TAIL_OFFSET);
>  
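
The retry loop in the quoted patch can be tried out in userspace with a
simulated status register. The following is only a sketch under stated
assumptions: struct demo_iommu, the demo_* helpers, the bit position of the
mask and the way the "hardware" re-raises the bit are all invented for
illustration and do not model the real IOMMU. The loop itself has the same
shape as in the patch: read the status, write the bit to clear it, and repeat
as long as the value that was read still had the bit set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_EVT_INT_MASK (1u << 1)   /* hypothetical bit position */

/* Simulated status register. The fake hardware sets the bit again after
 * each of the first `races` clear attempts. */
struct demo_iommu {
    uint32_t status;
    unsigned int races;    /* remaining racing updates by the fake hardware */
    unsigned int writes;   /* number of clear attempts, for inspection */
};

uint32_t demo_readl(const struct demo_iommu *r)
{
    return r->status;
}

/* Write-1-to-clear, unless a racing hardware update raises the bit again. */
void demo_writel(struct demo_iommu *r, uint32_t val)
{
    r->writes++;
    r->status &= ~val;
    if (r->races > 0) {
        r->races--;
        r->status |= DEMO_EVT_INT_MASK;
    }
}

/* Same structure as the do/while loop in the patch. */
void demo_ack_event_interrupt(struct demo_iommu *r)
{
    uint32_t status;

    assert(r != NULL);
    do {
        status = demo_readl(r);
        demo_writel(r, DEMO_EVT_INT_MASK);
    } while (status & DEMO_EVT_INT_MASK);
}
```

The loop only terminates after a read has returned the bit as cleared, so
every successful pass costs one extra read/write pair compared to the single
writel() that was used before the patch.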


Re: [PATCH RFC] pci: ACS quirk for AMD southbridge

2013-06-26 Thread Andreas Hartmann
Alex Williamson wrote:
> On Wed, 2013-06-26 at 17:14 +0200, Andreas Hartmann wrote:
>> Bjorn Helgaas wrote:
>>> [fix Joerg's email address]
>>>
>>> On Tue, Jun 25, 2013 at 10:15 PM, Bjorn Helgaas  wrote:
>>>> On Wed, Jul 11, 2012 at 11:18 PM, Alex Williamson
>>>>  wrote:
>>>>> We've confirmed that peer-to-peer between these devices is
>>>>> not possible.  We can therefore claim that they support a
>>>>> subset of ACS.
>>>>>
>>>>> Signed-off-by: Alex Williamson 
>>>>> Cc: Joerg Roedel 
>>>>> ---
>>>>>
>>>>> Two things about this patch make me a little nervous.  The
>>>>> first is that I'd really like to have a pci_is_pcie() test
>>>>> in pci_mf_no_p2p_acs_enabled(), but these devices don't
>>>>> have a PCIe capability.  That means that if there was a
>>>>> topology where these devices sit on a legacy PCI bus,
>>>>> we incorrectly return that we're ACS safe here.  That leads
>>>>> to my second problem, pciids seems to suggest that some of
>>>>> these functions have been around for a while.  Is it just
>>>>> this package that's peer-to-peer safe, or is it safe to
>>>>> assume that any previous assembly of these functions is
>>>>> also p2p safe.  Maybe we need to factor in device revs if
>>>>> that uniquely identifies this package?
>>>>>
>>>>> Looks like another useful device to potentially quirk
>>>>> would be:
>>>>>
>>>>> 00:15.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI 
>>>>> SB700/SB800/SB900 PCI to PCI bridge (PCIE port 0)
>>>>> 00:15.1 PCI bridge: Advanced Micro Devices [AMD] nee ATI 
>>>>> SB700/SB800/SB900 PCI to PCI bridge (PCIE port 1)
>>>>> 00:15.2 PCI bridge: Advanced Micro Devices [AMD] nee ATI SB900 PCI to PCI 
>>>>> bridge (PCIE port 2)
>>>>> 00:15.3 PCI bridge: Advanced Micro Devices [AMD] nee ATI SB900 PCI to PCI 
>>>>> bridge (PCIE port 3)
>>>>>
>>>>> 00:15.0 0604: 1002:43a0
>>>>> 00:15.1 0604: 1002:43a1
>>>>> 00:15.2 0604: 1002:43a2
>>>>> 00:15.3 0604: 1002:43a3
>>>>>
>>>>>  drivers/pci/quirks.c |   29 +
>>>>>  1 file changed, 29 insertions(+)
>>>>>
>>>>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>>>>> index 4ebc865..2c84961 100644
>>>>> --- a/drivers/pci/quirks.c
>>>>> +++ b/drivers/pci/quirks.c
>>>>> @@ -3271,11 +3271,40 @@ struct pci_dev *pci_get_dma_source(struct pci_dev 
>>>>> *dev)
>>>>> return pci_dev_get(dev);
>>>>>  }
>>>>>
>>>>> +/*
>>>>> + * Multifunction devices that do not support peer-to-peer between
>>>>> + * functions can claim to support a subset of ACS.  Such devices
>>>>> + * effectively enable request redirect (RR) and completion redirect (CR)
>>>>> + * since all transactions are redirected to the upstream root complex.
>>>>> + */
>>>>> +static int pci_mf_no_p2p_acs_enabled(struct pci_dev *dev, u16 acs_flags)
>>>>> +{
>>>>> +   if (!dev->multifunction)
>>>>> +   return -ENODEV;
>>>>> +
>>>>> +   /* Filter out flags not applicable to multifunction */
>>>>> +   acs_flags &= (PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_EC | PCI_ACS_DT);
>>>>> +
>>>>> +   return acs_flags & ~(PCI_ACS_RR | PCI_ACS_CR) ? 0 : 1;
>>>>> +}
>>>>> +
>>>>>  static const struct pci_dev_acs_enabled {
>>>>> u16 vendor;
>>>>> u16 device;
>>>>> int (*acs_enabled)(struct pci_dev *dev, u16 acs_flags);
>>>>>  } pci_dev_acs_enabled[] = {
>>>>> +   /*
>>>>> +* AMD/ATI multifunction southbridge devices.  AMD has confirmed
>>>>> +* that peer-to-peer between these devices is not possible, so
>>>>> +* they do support a subset of ACS even though the capability is
>>>>> +* not exposed in config space.
>>>>> +*/
>>>>> +   { PCI_VENDOR_ID_ATI, 0x4385, pci_mf_no_p2p_acs_enabled },
>>>>> +

Re: [PATCH RFC] pci: ACS quirk for AMD southbridge

2013-06-26 Thread Andreas Hartmann
Bjorn Helgaas wrote:
> [fix Joerg's email address]
> 
> On Tue, Jun 25, 2013 at 10:15 PM, Bjorn Helgaas  wrote:
>> On Wed, Jul 11, 2012 at 11:18 PM, Alex Williamson
>>  wrote:
>>> We've confirmed that peer-to-peer between these devices is
>>> not possible.  We can therefore claim that they support a
>>> subset of ACS.
>>>
>>> Signed-off-by: Alex Williamson 
>>> Cc: Joerg Roedel 
>>> ---
>>>
>>> Two things about this patch make me a little nervous.  The
>>> first is that I'd really like to have a pci_is_pcie() test
>>> in pci_mf_no_p2p_acs_enabled(), but these devices don't
>>> have a PCIe capability.  That means that if there was a
>>> topology where these devices sit on a legacy PCI bus,
>>> we incorrectly return that we're ACS safe here.  That leads
>>> to my second problem, pciids seems to suggest that some of
>>> these functions have been around for a while.  Is it just
>>> this package that's peer-to-peer safe, or is it safe to
>>> assume that any previous assembly of these functions is
>>> also p2p safe.  Maybe we need to factor in device revs if
>>> that uniquely identifies this package?
>>>
>>> Looks like another useful device to potentially quirk
>>> would be:
>>>
>>> 00:15.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI SB700/SB800/SB900 
>>> PCI to PCI bridge (PCIE port 0)
>>> 00:15.1 PCI bridge: Advanced Micro Devices [AMD] nee ATI SB700/SB800/SB900 
>>> PCI to PCI bridge (PCIE port 1)
>>> 00:15.2 PCI bridge: Advanced Micro Devices [AMD] nee ATI SB900 PCI to PCI 
>>> bridge (PCIE port 2)
>>> 00:15.3 PCI bridge: Advanced Micro Devices [AMD] nee ATI SB900 PCI to PCI 
>>> bridge (PCIE port 3)
>>>
>>> 00:15.0 0604: 1002:43a0
>>> 00:15.1 0604: 1002:43a1
>>> 00:15.2 0604: 1002:43a2
>>> 00:15.3 0604: 1002:43a3
>>>
>>>  drivers/pci/quirks.c |   29 +
>>>  1 file changed, 29 insertions(+)
>>>
>>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>>> index 4ebc865..2c84961 100644
>>> --- a/drivers/pci/quirks.c
>>> +++ b/drivers/pci/quirks.c
>>> @@ -3271,11 +3271,40 @@ struct pci_dev *pci_get_dma_source(struct pci_dev 
>>> *dev)
>>> return pci_dev_get(dev);
>>>  }
>>>
>>> +/*
>>> + * Multifunction devices that do not support peer-to-peer between
>>> + * functions can claim to support a subset of ACS.  Such devices
>>> + * effectively enable request redirect (RR) and completion redirect (CR)
>>> + * since all transactions are redirected to the upstream root complex.
>>> + */
>>> +static int pci_mf_no_p2p_acs_enabled(struct pci_dev *dev, u16 acs_flags)
>>> +{
>>> +   if (!dev->multifunction)
>>> +   return -ENODEV;
>>> +
>>> +   /* Filter out flags not applicable to multifunction */
>>> +   acs_flags &= (PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_EC | PCI_ACS_DT);
>>> +
>>> +   return acs_flags & ~(PCI_ACS_RR | PCI_ACS_CR) ? 0 : 1;
>>> +}
>>> +
>>>  static const struct pci_dev_acs_enabled {
>>> u16 vendor;
>>> u16 device;
>>> int (*acs_enabled)(struct pci_dev *dev, u16 acs_flags);
>>>  } pci_dev_acs_enabled[] = {
>>> +   /*
>>> +* AMD/ATI multifunction southbridge devices.  AMD has confirmed
>>> +* that peer-to-peer between these devices is not possible, so
>>> +* they do support a subset of ACS even though the capability is
>>> +* not exposed in config space.
>>> +*/
>>> +   { PCI_VENDOR_ID_ATI, 0x4385, pci_mf_no_p2p_acs_enabled },
>>> +   { PCI_VENDOR_ID_ATI, 0x439c, pci_mf_no_p2p_acs_enabled },
>>> +   { PCI_VENDOR_ID_ATI, 0x4383, pci_mf_no_p2p_acs_enabled },
>>> +   { PCI_VENDOR_ID_ATI, 0x439d, pci_mf_no_p2p_acs_enabled },
>>> +   { PCI_VENDOR_ID_ATI, 0x4384, pci_mf_no_p2p_acs_enabled },
>>> +   { PCI_VENDOR_ID_ATI, 0x4399, pci_mf_no_p2p_acs_enabled },
>>> { 0 }
>>>  };
>>>
>>>
>>
>> I was looking for something else and found this old email.  This patch
>> hasn't been applied and I haven't seen any discussion about it.  Is it
>> still of interest?  It seems relevant to the current ACS discussion
>> [1].

It is absolutely relevant. I always have to patch my kernel to be able to
pass my PCI device through to a VM. Currently I'm doing this for
kernel 3.9. I would be very glad to get these patches into the kernel, as
they don't do anything bad!
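
To make it easier to see which ACS flag combinations the quoted
pci_mf_no_p2p_acs_enabled() accepts, here is a small userspace sketch of its
filter logic. The function name demo_mf_no_p2p_acs_enabled and its plain
integer parameters are hypothetical. The PCI_ACS_* bit values are written
down from memory of the kernel's pci_regs.h and should be treated as an
assumption. The -1 return value merely stands in for -ENODEV.

```c
#include <assert.h>
#include <stdint.h>

/* ACS control bits; values assumed to match the kernel's pci_regs.h. */
#define PCI_ACS_SV 0x01   /* source validation */
#define PCI_ACS_RR 0x04   /* request redirect */
#define PCI_ACS_CR 0x08   /* completion redirect */
#define PCI_ACS_EC 0x20   /* egress control */
#define PCI_ACS_DT 0x40   /* direct translated P2P */

/* Returns 1 if the requested flags are covered, 0 if not, and -1 for a
 * device that is not multifunction (the patch returns -ENODEV there). */
int demo_mf_no_p2p_acs_enabled(int multifunction, uint16_t acs_flags)
{
    /* Only the low seven bits are ACS control flags in this sketch. */
    assert((acs_flags & ~0x7f) == 0);

    if (!multifunction)
        return -1;

    /* Filter out flags not applicable to multifunction devices. */
    acs_flags &= (PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_EC | PCI_ACS_DT);

    /* Anything beyond RR and CR (i.e. EC or DT) is not provided. */
    return (acs_flags & ~(PCI_ACS_RR | PCI_ACS_CR)) ? 0 : 1;
}
```

A request for RR and CR alone is reported as enabled, a request that also
includes EC or DT is not, and flags such as SV are ignored by the filter.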

My multifunction devices are the devices defined in the patch. The
PCI device I currently pass through is an Intel ethernet device:

-[:00]-+-00.0  Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge 
(external gfx0 port B)
   +-00.2  Advanced Micro Devices [AMD] nee ATI RD990 I/O Memory 
Management Unit (IOMMU)
   +-02.0-[01]--+-00.0  Advanced Micro Devices [AMD] nee ATI Turks 
[Radeon HD 6570]
   |\-00.1  Advanced Micro Devices [AMD] nee ATI Turks HDMI 
Audio [Radeon HD 6000 Series]
   +-04.0-[02]00.0  Etron Technology, Inc. EJ168 USB 3.0 Host 
Controller
   +-05.0-[03]00.0  Atheros Communications Inc. AR9300 Wireless LAN 
adaptor
   +-

Re: [ 104/173] rt2x00: Dont let mac80211 send a BAR when an AMPDU subframe fails

2013-01-07 Thread Andreas Hartmann
Hello Stanislaw!

Stanislaw Gruszka wrote:
> On Mon, Jan 07, 2013 at 07:38:35PM +0100, Andreas Hartmann wrote:
>> Stanislaw Gruszka wrote:
>>> On Mon, Jan 07, 2013 at 04:04:01PM +0100, Andreas Hartmann wrote:
>>>> Ben Hutchings wrote:
>>>>> On Mon, 2013-01-07 at 09:10 +0100, Stanislaw Gruszka wrote:
>>>>>> On Mon, Jan 07, 2013 at 09:05:32AM +0100, Stanislaw Gruszka wrote:
>>>>>>>> To be clear, I have all of these in the queue:
>>>>>>>>
>>>>>>>> be03d4a45c09 rt2x00: Don't let mac80211 send a BAR when an AMPDU 
>>>>>>>> subframe fails
>>>>>>>> 5b632fe85ec8 mac80211: introduce IEEE80211_HW_TEARDOWN_AGGR_ON_BAR_FAIL
>>>>>>>> ab9d6e4ffe19 Revert: "rt2x00: Don't let mac80211 send a BAR when an 
>>>>>>>> AMPDU subframe fails"
>>>>>>>>
>>>>>>>> and I'm intending to drop/defer them all.
>>>>>>>
>>>>>>> Patch 3 is a revert of patch 1 (questioned patch). Please apply all 3 
>>>>>>> patches,
>>>>>>> or only patch 2.
>>>>>>
>>>>>> No, actually all 3 patches have to be applied. Because last one, except
>>>>>> revert, include flag IEEE80211_HW_TEARDOWN_AGGR_ON_BAR_FAIL setting in 
>>>>>> rt2x00
>>>>>> driver, which make patch 2 work.
>>>>>
>>>>> Andreas said that that after ab9d6e4ffe19 there was still a regression.
>>>
>>> That's not true. There will be no regression after ab9d6e4ffe20. The
>>> only thing is that solution is not perfect. But perfect solution require
>>> lot of changes i.e. is not -stable appropriate (and does not exist 
>>> currently).
>>>
>>>>> But maybe he was confused.  I know I'm confused.
>>>> :-))
>>>>
>>>> No, the thing is:
>>>> rt2800pci misses an appropriate handling of aggregation (which meets the
>>>> requirements of mac80211).
>>>>
>>>> Both workarounds, mine and the new workaround from Stanislaw (which is
>>>> nothing more than a restricted version of my initial workaround), work
>>>
>>> Your workaround broke STA mode on some environment.
>>
>> Why are you sure that this workaround doesn't break some other devices
>> running in AP mode? At that time we also believed it wouldn't harm
>> STA mode. But this was wrong for some (which?) devices.
> 
> Because it make behaviour the same as it was before 3.2, which introduce
> those issues.

You're so right, Stanislaw! I should have looked at your patch again
before writing those stupid lines about the differentiation between
STA and AP.

My apologies!


Kind regards,
Andreas


Re: [ 104/173] rt2x00: Dont let mac80211 send a BAR when an AMPDU subframe fails

2013-01-07 Thread Andreas Hartmann
Hello Helmut!

Helmut Schaa wrote:
> On Mon, Jan 7, 2013 at 4:04 PM, Andreas Hartmann
>  wrote:
>> The solution would be IMHO, to implement an own aggregation handling,
>> maybe the same way as it was done for carl9170, which had the same problem:
>>
>> http://thread.gmane.org/gmane.linux.kernel.wireless.general/100793/focus=1405
>>
>> I prefer to have solutions (if one is known) instead of another workaround.
> 
> JFI, I'm just working on exactly that (handling BAR TX status in
> driver to implement proper RX reorder window flushing at the peer). I'll post 
> it for
> further testing to the rt2x00 list once I'm done.

Thank you for your time spent on this problem! I really appreciate it!


Kind regards!
Andreas


Re: [ 104/173] rt2x00: Dont let mac80211 send a BAR when an AMPDU subframe fails

2013-01-07 Thread Andreas Hartmann
Stanislaw Gruszka wrote:
> On Mon, Jan 07, 2013 at 04:04:01PM +0100, Andreas Hartmann wrote:
>> Ben Hutchings wrote:
>>> On Mon, 2013-01-07 at 09:10 +0100, Stanislaw Gruszka wrote:
>>>> On Mon, Jan 07, 2013 at 09:05:32AM +0100, Stanislaw Gruszka wrote:
>>>>>> To be clear, I have all of these in the queue:
>>>>>>
>>>>>> be03d4a45c09 rt2x00: Don't let mac80211 send a BAR when an AMPDU 
>>>>>> subframe fails
>>>>>> 5b632fe85ec8 mac80211: introduce IEEE80211_HW_TEARDOWN_AGGR_ON_BAR_FAIL
>>>>>> ab9d6e4ffe19 Revert: "rt2x00: Don't let mac80211 send a BAR when an 
>>>>>> AMPDU subframe fails"
>>>>>>
>>>>>> and I'm intending to drop/defer them all.
>>>>>
>>>>> Patch 3 is a revert of patch 1 (questioned patch). Please apply all 3 
>>>>> patches,
>>>>> or only patch 2.
>>>>
>>>> No, actually all 3 patches have to be applied. Because last one, except
>>>> revert, include flag IEEE80211_HW_TEARDOWN_AGGR_ON_BAR_FAIL setting in 
>>>> rt2x00
>>>> driver, which make patch 2 work.
>>>
>>> Andreas said that that after ab9d6e4ffe19 there was still a regression.
> 
> That's not true. There will be no regression after ab9d6e4ffe20. The
> only thing is that solution is not perfect. But perfect solution require
> lot of changes i.e. is not -stable appropriate (and does not exist currently).
> 
>>> But maybe he was confused.  I know I'm confused.
>> :-))
>>
>> No, the thing is:
>> rt2800pci misses an appropriate handling of aggregation (which meets the
>> requirements of mac80211).
>>
>> Both workarounds, mine and the new workaround from Stanislaw (which is
>> nothing more than a restricted version of my initial workaround), work
> 
> Your workaround broke STA mode on some environment.

Why are you sure that this workaround doesn't break some other devices
running in AP mode? At that time we also believed it wouldn't harm
STA mode. But this was wrong for some (which?) devices.


Anyway: since Helmut has meanwhile mentioned that he is, thankfully, working
on a solution, I'm fine with the second round of workarounds.



Kind regards,
Andreas


Re: [ 104/173] rt2x00: Dont let mac80211 send a BAR when an AMPDU subframe fails

2013-01-07 Thread Andreas Hartmann
Ben Hutchings wrote:
> On Mon, 2013-01-07 at 09:10 +0100, Stanislaw Gruszka wrote:
>> On Mon, Jan 07, 2013 at 09:05:32AM +0100, Stanislaw Gruszka wrote:
>>>> To be clear, I have all of these in the queue:
>>>>
>>>> be03d4a45c09 rt2x00: Don't let mac80211 send a BAR when an AMPDU subframe 
>>>> fails
>>>> 5b632fe85ec8 mac80211: introduce IEEE80211_HW_TEARDOWN_AGGR_ON_BAR_FAIL
>>>> ab9d6e4ffe19 Revert: "rt2x00: Don't let mac80211 send a BAR when an AMPDU 
>>>> subframe fails"
>>>>
>>>> and I'm intending to drop/defer them all.
>>>
>>> Patch 3 is a revert of patch 1 (questioned patch). Please apply all 3 
>>> patches,
>>> or only patch 2.
>>
>> No, actually all 3 patches have to be applied. Because last one, except
>> revert, include flag IEEE80211_HW_TEARDOWN_AGGR_ON_BAR_FAIL setting in rt2x00
>> driver, which make patch 2 work.
> 
> Andreas said that that after ab9d6e4ffe19 there was still a regression.
> But maybe he was confused.  I know I'm confused.

:-))

No, the thing is:
rt2800pci misses an appropriate handling of aggregation (which meets the
requirements of mac80211).

Both workarounds, mine and the new workaround from Stanislaw (which is
nothing more than a restricted version of my initial workaround), work
like this:
Let the peer do the aggregation handling. If it's not done by the peer,
the connection will break down.

Therefore:
The solution would be, IMHO, to implement aggregation handling in the driver
itself, maybe the same way as it was done for carl9170, which had the same
problem:

http://thread.gmane.org/gmane.linux.kernel.wireless.general/100793/focus=1405

I prefer a real solution (if one is known) to yet another workaround.
If I use my device as STA instead of as an AP, it even works fine w/o
Stanislaw's patch. Do you understand what I'm trying to say?



Thanks,
kind regards,
Andreas


Re: [ 104/173] rt2x00: Don't let mac80211 send a BAR when an AMPDU subframe fails

2012-12-29 Thread Andreas Hartmann
Ben Hutchings wrote:
> 3.2-stable review patch.  If anyone has any objections, please let me know.
> 
> --
> 
> From: Andreas Hartmann 
> 
> commit be03d4a45c09ee5100d3aaaedd087f19bc20d01f upstream.

[...]

This patch is a workaround for

mac80211: retry sending failed BAR frames later instead of tearing down
aggr (http://www.spinics.net/lists/linux-wireless/msg76379.html -
f0425beda4d404a6e751439b562100b902ba9c98)
See:
http://thread.gmane.org/gmane.linux.kernel.wireless.general/83297/focus=83304


In the meantime there was a bug report about problems with
be03d4a45 when the device is used as a STA:
http://thread.gmane.org/gmane.linux.drivers.rt2x00.user/1257
A few other workaround proposals can be found there.


Stanislaw Gruszka proposed a final(?) workaround here, which refines
workaround be03d4a45c by restricting it to AP mode:
http://thread.gmane.org/gmane.linux.kernel.wireless.general/100793


carl9170 had the same problem with f0425beda. There it was fixed like
this:
http://thread.gmane.org/gmane.linux.kernel.wireless.general/100793/focus=1405
This approach fixes the real problem (no aggregation handling by the
firmware / hardware) by implementing it in the driver.
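
For orientation, the frame all of these workarounds revolve around can be
sketched as follows. This is only an illustration of the 802.11 BlockAck
Request (BAR) layout, not code from rt2x00, carl9170 or mac80211. The names
bar_frame and bar_fill are made up for this example, and a little-endian
host is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative layout of an 802.11 BlockAck Request (BAR) control frame.
 * Names are hypothetical; a little-endian host is assumed, so the 16-bit
 * fields are stored without byte swapping. */
struct bar_frame {
	uint16_t frame_control;  /* type = control, subtype = BlockAckReq */
	uint16_t duration;
	uint8_t  ra[6];          /* receiver address */
	uint8_t  ta[6];          /* transmitter address */
	uint16_t control;        /* BAR control, TID in bits 12..15 */
	uint16_t start_seq_ctrl; /* starting sequence number in bits 4..15 */
} __attribute__((packed));

#define FC_TYPE_CTL         0x0004 /* frame type: control */
#define FC_STYPE_BACK_REQ   0x0080 /* subtype: BlockAck Request */
#define BAR_CTRL_COMPRESSED 0x0004 /* compressed BlockAck variant */

/* Fill in a BAR that asks the peer to move its reorder window for
 * the given TID to the starting sequence number ssn. */
static void bar_fill(struct bar_frame *bar, const uint8_t ra[6],
		     const uint8_t ta[6], unsigned tid, unsigned ssn)
{
	assert(bar != NULL && tid < 16 && ssn < 4096);

	memset(bar, 0, sizeof(*bar));
	bar->frame_control = FC_TYPE_CTL | FC_STYPE_BACK_REQ;
	memcpy(bar->ra, ra, 6);
	memcpy(bar->ta, ta, 6);
	bar->control = (uint16_t)(BAR_CTRL_COMPRESSED | (tid << 12));
	bar->start_seq_ctrl = (uint16_t)(ssn << 4);
}
```

A driver that handles aggregation itself would additionally have to remember,
per TID, whether such a frame is still outstanding; that bookkeeping is not
shown here.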

Unfortunately, I haven't seen an equivalent of c9122c0d63a50 for
rt2x00 so far.



Kind regards,
Andreas Hartmann


Re: [PATCH RFC] pci: ACS quirk for AMD southbridge

2012-07-12 Thread Andreas Hartmann
Hello Alex,

I tested the patch below against linux 3.4.4 and with this
PCI WLAN-device:

06:07.0 Network controller: Ralink corp. RT2800 802.11n PCI
06:07.0 0280: 1814:0601

The device resides behind a PCI to PCI bridge:

00:14.4 PCI bridge: Advanced Micro Devices [AMD] nee ATI SBx00 PCI to PCI 
Bridge (rev 40) (prog-if 01 [Subtractive decode])
00:14.4 0604: 1002:4384 (rev 40) (prog-if 01 [Subtractive decode])

The device works fine in KVM (64-bit). Surprisingly, it isn't necessary
at all to assign the PCI-to-PCI bridge to the VM. It's enough to assign the
WLAN device to the VM and bind it to vfio-pci. That's all. The bridge
isn't bound to vfio-pci (it isn't bound to any driver).
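
The binding step mentioned above is normally done through sysfs. A minimal
sketch of the paths and the ID string involved follows. The PCI domain
"0000" and the names vfio_bind_strings / vfio_bind_strings_init are
assumptions for illustration, and the code only formats the strings; it
does not write to sysfs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Strings needed to hand a PCI device over to vfio-pci via sysfs.
 * Hypothetical helper type, not part of any library. */
struct vfio_bind_strings {
	char unbind_path[96];  /* write the device address here to unbind it */
	char new_id_path[64];  /* write "vendor device" here to let vfio-pci claim it */
	char new_id_value[16];
};

/* bdf is the full address "dddd:bb:dd.f", e.g. "0000:06:07.0".
 * Returns 0 on success, -1 on invalid input. */
static int vfio_bind_strings_init(struct vfio_bind_strings *s, const char *bdf,
				  unsigned vendor, unsigned device)
{
	int n;

	assert(s != NULL && bdf != NULL);
	if (strlen(bdf) != 12 || vendor > 0xffff || device > 0xffff)
		return -1;

	n = snprintf(s->unbind_path, sizeof(s->unbind_path),
		     "/sys/bus/pci/devices/%s/driver/unbind", bdf);
	if (n < 0 || (size_t)n >= sizeof(s->unbind_path))
		return -1;

	snprintf(s->new_id_path, sizeof(s->new_id_path),
		 "/sys/bus/pci/drivers/vfio-pci/new_id");
	snprintf(s->new_id_value, sizeof(s->new_id_value),
		 "%04x %04x", vendor, device);
	return 0;
}
```

For the WLAN device above this yields the value "1814 0601". The unbind path
only exists while some driver is bound to the device.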

I dropped linux-pci because I'm not a member of that list.


Thanks.
kind regards,
Andreas


Alex Williamson wrote:
> We've confirmed that peer-to-peer between these devices is
> not possible.  We can therefore claim that they support a
> subset of ACS.
> 
> Signed-off-by: Alex Williamson 
Tested-by: Andreas Hartmann 
> Cc: Joerg Roedel 
> ---
> 
> Two things about this patch make me a little nervous.  The
> first is that I'd really like to have a pci_is_pcie() test
> in pci_mf_no_p2p_acs_enabled(), but these devices don't
> have a PCIe capability.  That means that if there was a
> topology where these devices sit on a legacy PCI bus,
> we incorrectly return that we're ACS safe here.  That leads
> to my second problem, pciids seems to suggest that some of
> these functions have been around for a while.  Is it just
> this package that's peer-to-peer safe, or is it safe to
> assume that any previous assembly of these functions is
> also p2p safe.  Maybe we need to factor in device revs if
> that uniquely identifies this package?
> 
> Looks like another useful device to potentially quirk
> would be:
> 
> 00:15.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI SB700/SB800/SB900 
> PCI to PCI bridge (PCIE port 0)
> 00:15.1 PCI bridge: Advanced Micro Devices [AMD] nee ATI SB700/SB800/SB900 
> PCI to PCI bridge (PCIE port 1)
> 00:15.2 PCI bridge: Advanced Micro Devices [AMD] nee ATI SB900 PCI to PCI 
> bridge (PCIE port 2)
> 00:15.3 PCI bridge: Advanced Micro Devices [AMD] nee ATI SB900 PCI to PCI 
> bridge (PCIE port 3)
> 
> 00:15.0 0604: 1002:43a0
> 00:15.1 0604: 1002:43a1
> 00:15.2 0604: 1002:43a2
> 00:15.3 0604: 1002:43a3
> 
>  drivers/pci/quirks.c |   29 +
>  1 file changed, 29 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 4ebc865..2c84961 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3271,11 +3271,40 @@ struct pci_dev *pci_get_dma_source(struct pci_dev 
> *dev)
>   return pci_dev_get(dev);
>  }
>  
> +/*
> + * Multifunction devices that do not support peer-to-peer between
> + * functions can claim to support a subset of ACS.  Such devices
> + * effectively enable request redirect (RR) and completion redirect (CR)
> + * since all transactions are redirected to the upstream root complex.
> + */
> +static int pci_mf_no_p2p_acs_enabled(struct pci_dev *dev, u16 acs_flags)
> +{
> +	if (!dev->multifunction)
> +		return -ENODEV;
> +
> +	/* Filter out flags not applicable to multifunction */
> +	acs_flags &= (PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_EC | PCI_ACS_DT);
> +
> +	return acs_flags & ~(PCI_ACS_RR | PCI_ACS_CR) ? 0 : 1;
> +}
> +
>  static const struct pci_dev_acs_enabled {
>  	u16 vendor;
>  	u16 device;
>  	int (*acs_enabled)(struct pci_dev *dev, u16 acs_flags);
>  } pci_dev_acs_enabled[] = {
> +	/*
> +	 * AMD/ATI multifunction southbridge devices.  AMD has confirmed
> +	 * that peer-to-peer between these devices is not possible, so
> +	 * they do support a subset of ACS even though the capability is
> +	 * not exposed in config space.
> +	 */
> +	{ PCI_VENDOR_ID_ATI, 0x4385, pci_mf_no_p2p_acs_enabled },
> +	{ PCI_VENDOR_ID_ATI, 0x439c, pci_mf_no_p2p_acs_enabled },
> +	{ PCI_VENDOR_ID_ATI, 0x4383, pci_mf_no_p2p_acs_enabled },
> +	{ PCI_VENDOR_ID_ATI, 0x439d, pci_mf_no_p2p_acs_enabled },
> +	{ PCI_VENDOR_ID_ATI, 0x4384, pci_mf_no_p2p_acs_enabled },
> +	{ PCI_VENDOR_ID_ATI, 0x4399, pci_mf_no_p2p_acs_enabled },
>  	{ 0 }
>  };
>  
> 


crash with linux 2.6.16 under high network traffic

2007-06-06 Thread Andreas Hartmann
lete 72373, find
23938/34427, race 0+2
Jun  6 13:15:36 pscudb01 kernel: Free swap  = 4170160kB
Jun  6 13:15:36 pscudb01 kernel: Total swap = 4200956kB
Jun  6 13:15:36 pscudb01 kernel: Free swap:   4170160kB
Jun  6 13:15:36 pscudb01 kernel: 6291456 pages of RAM
Jun  6 13:15:36 pscudb01 kernel: 214414 reserved pages
Jun  6 13:15:36 pscudb01 kernel: 28836 pages shared
Jun  6 13:15:36 pscudb01 kernel: 1209 pages swap cached


Sometimes the OOM killer also becomes active before the machine crashes.


Does anybody have an idea how to narrow down this problem? How can
I see how much memory the network driver module uses?

Background:
I suspect the cassini driver is the problem (memory leak?),
because I didn't have this problem when using another NIC and
driver instead.
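
One way to watch for such a leak is to sample /proc/slabinfo periodically
and compare the object counts of network-related caches (for example
skbuff_head_cache). The following is only a sketch of parsing one line of
that file; the names slab_usage, slab_parse_line and slab_bytes are
hypothetical, and whether the cassini allocations show up in a particular
cache is an assumption that would have to be checked.

```c
#include <assert.h>
#include <stdio.h>

/* The first four columns of a /proc/slabinfo data line. */
struct slab_usage {
	char name[32];
	unsigned long active_objs;
	unsigned long num_objs;
	unsigned long objsize; /* bytes per object */
};

/* Parse one data line of /proc/slabinfo:
 *   name <active_objs> <num_objs> <objsize> ...
 * Returns 0 on success, -1 for header/comment lines or malformed input. */
static int slab_parse_line(const char *line, struct slab_usage *u)
{
	assert(line != NULL && u != NULL);
	if (line[0] == '#')
		return -1;
	if (sscanf(line, "%31s %lu %lu %lu", u->name,
		   &u->active_objs, &u->num_objs, &u->objsize) != 4)
		return -1;
	return 0;
}

/* Rough memory footprint of a cache; ignores per-slab overhead. */
static unsigned long slab_bytes(const struct slab_usage *u)
{
	return u->num_objs * u->objsize;
}
```

A cache whose slab_bytes() value keeps growing under network load and never
shrinks again would be a candidate for the leak.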




Kind regards,
Andreas Hartmann


Re: [2.6.18.2] ide_core bug: kobject_add failed for ide ... - with vanilla kernel

2007-01-08 Thread Andreas Hartmann
Hello Lee,

Lee Revell wrote:
> On Sun, 2007-01-07 at 18:44 +0100, Andreas Hartmann wrote:
>> Hello,
>> 
>> ide_core is loaded (while putting in an USB stick) as module the first
>> time after reboot - all works fine. The USB stick got mounted and a ls
>> is done to show the files on the root of the filesystem of the stick.
>> Afterwards, the stick is securely removed from the system.
>> Afterwards, ide_core is unloaded with rmmod (after usb-storage has been
>> unloaded) - ok.
>> 
>> Next step is to load ide_core again. Now, the following error can be
>> found in /var/log/messages:
>> 
>> 
>> Jan  7 11:48:18 notebook1 kernel: Uniform Multi-Platform E-IDE driver
>> Revision: 7.00alpha2
>> Jan  7 11:48:18 notebook1 kernel: ide: Assuming 33MHz system bus speed
>> for PIO modes; override with idebus=xx
>> Jan  7 11:48:18 notebook1 kernel: kobject_add failed for ide with
>> -EEXIST, don't try to register things with the same name in the same
>> directory.
> 
> You seem to be running a SuSE kernel - please report the issue to them.

You are right - but the same error appears with the vanilla kernel, too.
That's why I reported it here.

> It's probably useful to repeat your test but run "find /sys/module >
> sys1" before loading ide_core the first time, then "find /sys/module >
> sys2" after "rmmod ide_core", and save the output of "diff sys1 sys2".

There isn't any difference.


Kind regards,
Andreas Hartmann


[2.6.18.2] ide_core bug: kobject_add failed for ide ...

2007-01-07 Thread Andreas Hartmann
ation 82801FBM (ICH6M) LPC Interface
Bridge (rev 04)
Subsystem: Mitac Unknown device 8048
Flags: bus master, medium devsel, latency 0

00:1f.2 IDE interface: Intel Corporation 82801FBM (ICH6M) SATA
Controller (rev 04) (prog-if 80 [Master])
Subsystem: Mitac Unknown device 8048
Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 177
I/O ports at 
I/O ports at 
I/O ports at 
I/O ports at 
I/O ports at 1100 [size=16]
Capabilities: [70] Power Management version 2

00:1f.3 SMBus: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family)
SMBus Controller (rev 04)
Subsystem: Mitac Unknown device 8048
Flags: medium devsel, IRQ 177
I/O ports at 1400 [size=32]

01:02.0 CardBus bridge: Texas Instruments PCIxx21/x515 Cardbus Controller
Subsystem: Mitac Unknown device 8048
Flags: bus master, medium devsel, latency 168, IRQ 169
Memory at cc009000 (32-bit, non-prefetchable) [size=4K]
Bus: primary=01, secondary=02, subordinate=05, sec-latency=176
Memory window 0: 9c00-9dfff000 (prefetchable)
Memory window 1: ce00-c000
I/O window 0: c400-c4ff
I/O window 1: c800-c8ff
16-bit legacy interface ports at 0001

01:02.2 FireWire (IEEE 1394): Texas Instruments OHCI Compliant IEEE 1394
Host Controller (prog-if 10 [OHCI])
Subsystem: Mitac Unknown device 8048
Flags: bus master, medium devsel, latency 128, IRQ 177
Memory at fedff800 (32-bit, non-prefetchable) [size=2K]
Memory at cc00c000 (32-bit, non-prefetchable) [size=16K]
Capabilities: [44] Power Management version 2

01:04.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL-8139/8139C/8139C+ (rev 10)
Subsystem: Mitac Unknown device 8048
Flags: bus master, medium devsel, latency 128, IRQ 209
I/O ports at c000 [size=256]
Memory at cc008000 (32-bit, non-prefetchable) [size=256]
Capabilities: [50] Power Management version 2

01:05.0 Network controller: RaLink RT2561/RT61 rev B 802.11g
Subsystem: Micro-Star International Co., Ltd. Unknown device b833
Flags: bus master, slow devsel, latency 128, IRQ 11
Memory at cc00 (32-bit, non-prefetchable) [size=32K]
Capabilities: [40] Power Management version 2

notebook1:~ # lsmod
Module                  Size  Used by
ide_core              129992  0
af_packet              29320  2
ipv6                  263584  12
button                 10896  0
battery                14340  0
ac                      9476  0
twofish                47488  3
cryptoloop              7680  3
ohci_hcd               23428  0
apparmor               55572  0
aamatch_pcre           18304  1 apparmor
loop                   20488  7 cryptoloop
dm_mod                 60184  13
pcmcia                 40892  0
firmware_class         14080  1 pcmcia
usbhid                 52192  0
yenta_socket           30348  1
ohci1394               37040  0
rsrc_nonstatic         17024  1 yenta_socket
pcmcia_core            43412  3 pcmcia,yenta_socket,rsrc_nonstatic
ieee1394              102584  1 ohci1394
snd_hda_intel          23060  1
snd_hda_codec         164352  1 snd_hda_intel
snd_pcm                86916  2 snd_hda_intel,snd_hda_codec
snd_timer              27908  1 snd_pcm
snd                    61188  6 snd_hda_intel,snd_hda_codec,snd_pcm,snd_timer
8139too                30592  0
soundcore              13792  1 snd
intel_agp              27804  1
snd_page_alloc         14472  2 snd_hda_intel,snd_pcm
mii                     9600  1 8139too
agpgart                35528  2 intel_agp
ehci_hcd               34696  0
uhci_hcd               26892  0
i2c_i801               11660  0
usbcore               114896  4 ohci_hcd,usbhid,ehci_hcd,uhci_hcd
i2c_core               25216  1 i2c_i801
reiserfs              237312  7
sr_mod                 20132  0
cdrom                  38432  1 sr_mod
edd                    13892  0
fan                     8964  1
sg                     38044  0
ata_piix               19332  3
ahci                   25860  0
libata                119188  2 ata_piix,ahci
thermal                18568  1
processor              34664  1 thermal
sd_mod                 24576  4
scsi_mod              136712  5 sr_mod,sg,ahci,libata,sd_mod



Kind regards,
Andreas Hartmann


[2.6.18.2] ide_core oops

2007-01-07 Thread Andreas Hartmann
nt version 2

01:04.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL-8139/8139C/8139C+ (rev 10)
Subsystem: Mitac Unknown device 8048
Flags: bus master, medium devsel, latency 128, IRQ 209
I/O ports at c000 [size=256]
Memory at cc008000 (32-bit, non-prefetchable) [size=256]
Capabilities: [50] Power Management version 2

01:05.0 Network controller: RaLink RT2561/RT61 rev B 802.11g
Subsystem: Micro-Star International Co., Ltd. Unknown device b833
Flags: bus master, slow devsel, latency 128, IRQ 11
Memory at cc00 (32-bit, non-prefetchable) [size=32K]
Capabilities: [40] Power Management version 2


lsmod (when there is no error)
button                 10896  0
usb_storage            82112  0
ide_core              129992  1 usb_storage
ohci_hcd               23428  0
uhci_hcd               26892  0
ohci1394               37040  0
ieee1394              102584  1 ohci1394
nls_iso8859_1           8320  0
nls_cp437               9984  0
vfat                   16640  0
fat                    55324  1 vfat
af_packet              29320  2
ipv6                  263584  16
battery                14340  0
ac                      9476  0
twofish                47488  3
cryptoloop              7680  3
apparmor               55572  0
aamatch_pcre           18304  1 apparmor
loop                   20488  7 cryptoloop
dm_mod                 60184  13
pcmcia                 40892  0
firmware_class         14080  1 pcmcia
usbhid                 52192  0
snd_hda_intel          23060  1
snd_hda_codec         164352  1 snd_hda_intel
snd_pcm                86916  2 snd_hda_intel,snd_hda_codec
8139too                30592  0
yenta_socket           30348  1
snd_timer              27908  1 snd_pcm
rsrc_nonstatic         17024  1 yenta_socket
snd                    61188  6 snd_hda_intel,snd_hda_codec,snd_pcm,snd_timer
mii                     9600  1 8139too
pcmcia_core            43412  3 pcmcia,yenta_socket,rsrc_nonstatic
soundcore              13792  1 snd
ehci_hcd               34696  0
snd_page_alloc         14472  2 snd_hda_intel,snd_pcm
usbcore               114896  5 usb_storage,ohci_hcd,uhci_hcd,usbhid,ehci_hcd
i2c_i801               11660  0
intel_agp              27804  1
agpgart                35528  3 intel_agp
i2c_core               25216  1 i2c_i801
reiserfs              237312  7
sr_mod                 20132  0
cdrom                  38432  1 sr_mod
edd                    13892  0
fan                     8964  1
sg                     38044  0
ata_piix               19332  3
ahci                   25860  0
libata                119188  2 ata_piix,ahci
thermal                18568  1
processor              34664  1 thermal
sd_mod                 24576  4
scsi_mod              136712  6 usb_storage,sr_mod,sg,ahci,libata,sd_mod


Kind regards,
Andreas Hartmann


Re: forbid to strace a program

2005-09-04 Thread Andreas Hartmann
Chase Venters wrote:
>> Is there another way to do this? If the password is crypted, I need a
>> passphrase or something other to decrypt it again. Not really a solution
>> of the problem.
>>
>> Therefore, it would be best, to hide it by preventing stracing of the
>> application to all users and root.
>>
>> Ok, root could search for the password directly in the memory, but this
>> would be not as easy as a strace.
> 
> Obfuscation isn't really valid security. Making something 'harder' to break 
> isn't a solution unless you're making it hard enough that current technology 
> can't break it (eg... you always have the brute force option, but good crypto 
> intends to make such an option impossible without expending zillions of clock 
> cycles). 

You're right. If I had a solution that could do this, I would
prefer it.

> 
> Can I ask why you want to hide the database password from root?

It's easy: for security reasons. There could always be bugs in some
software that make it possible for another user to gain root
privileges. They could then easily use strace to get information they
shouldn't be able to get. The password they would see isn't just used for
the DB, but for some other applications, too. That's the disadvantage of
general (single sign-on) passwords.


Kind regards,
Andreas Hartmann


Re: forbid to strace a program

2005-09-03 Thread Andreas Hartmann
Alex Riesen wrote:
> On 9/3/05, Andreas Hartmann <[EMAIL PROTECTED]> wrote:
>> Hello!
>> 
>> Is it possible to prevent a program to be straced on x86?
>> What do I have to do, eg., to prevent a perl-program to be straced?
>> 
> 
> So that none can see what are you doing? Or because your program is
> breaking because of this? Probably nothing, but someone would like
> to know what it is you are doing and exactly how it breaks (and, if
> you don't mind -
> why it breaks).

That's not really the problem. I want to hide a cleartext password in
that program (something like ssh-agent or gpg-agent; the latter can be
straced, too :-( ), which it needs for a database while it runs.

Is there another way to do this? If the password is encrypted, I need a
passphrase or something else to decrypt it again. Not really a solution
to the problem.

Therefore it would be best to hide it by preventing all users, including
root, from stracing the application.

OK, root could search for the password directly in memory, but this
would not be as easy as running strace.
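
For what it's worth, a partial measure that exists on Linux is to mark the
process as non-dumpable with prctl(PR_SET_DUMPABLE, 0). As far as I know,
this keeps other unprivileged processes from attaching with ptrace (and
therefore strace), but it does not stop root, and it does not help if the
program was started under strace in the first place. A minimal sketch, with
the helper names make_undumpable and wipe_secret invented for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/prctl.h>

/* Mark the calling process as non-dumpable (Linux-specific).
 * Unprivileged processes can then no longer ptrace-attach to it;
 * a process with CAP_SYS_PTRACE (typically root) still can.
 * Returns 0 on success, -1 on failure. */
static int make_undumpable(void)
{
	if (prctl(PR_SET_DUMPABLE, 0, 0, 0, 0) != 0)
		return -1;
	return prctl(PR_GET_DUMPABLE, 0, 0, 0, 0) == 0 ? 0 : -1;
}

/* Overwrite a secret once it is no longer needed. The volatile
 * pointer keeps the compiler from optimizing the stores away. */
static void wipe_secret(char *buf, size_t len)
{
	volatile char *p = buf;

	assert(buf != NULL);
	while (len--)
		*p++ = 0;
}
```

A Perl program would need a small wrapper or a direct syscall to do the
same; that part is left out here.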



Kind regards,
Andreas Hartmann


Re: More performance for the TCP stack by using additional hardware chip on NIC

2005-04-17 Thread Andreas Hartmann
Willy Tarreau wrote:
> Hello !
> 
> On Sun, Apr 17, 2005 at 01:29:14PM +0300, Avi Kivity wrote:
>> On Sun, 2005-04-17 at 12:07, Arjan van de Ven wrote:
>> > On Sun, 2005-04-17 at 10:17 +0200, Andreas Hartmann wrote:
>> > > Hello!
>> > > 
>> > > Alacritech developed a new chip for NIC's
>> > > (http://www.alacritech.com/html/tech_review.html), which makes it 
>> > > possible
>> > > to take away the TCP stack from the host CPU. Therefore, the host CPU has
>> > > more performance for the applications according Alacritech.
>> > 
>> > there are very many good reasons why this for linux is not the right
>> > solution, including the fact that the linux tcp/ip stack already is
>> > quite fast so the "gains" achieved aren't that stellar as the gains you
>> > get when comparing to windows.
>> > 
>> 
>> TOEs can remove the data copy on receive. In some applications (notably
>> storage), where the application does not touch most of the data, this is
>> a significant advantage that cannot be achieved in a software-only
>> solution.
> 
> Well, if the application does not touch most of the data, either it
> is playing as a relay, and the data will at least have to be copied,
> or it will play as a client or server which reads from/writes to disk,
> and in this case, I wonder how the NIC will send its writes directly
> to the disk controller without some help.
> 
> What worries me with those NICs is that you have no control on the
> TCP stack. You often have to disable the acceleration when you
> want to insert even 1 firewall rule, use policy routing or even
> do a simple anti-spoofing check. It is exactly like the routers
> which do many things in hardware at wire speed, but jump to snail
> speed when you enable any advanced feature.
> 
>> > Also these types of solution always add quite a bit of overhead to
>> > connection setup/teardown making it actually a *loss* for the "many
>> > short connections" types of workloads. Now guess which things certain
>> > benchmarks use, and guess what real world servers do :)
>> > 
>> 
>> again, this depends on the application.
> 
> The speed itself depends on the application. An application which
> goal is to achieve 10 Gbps needs to be written with this goal in
> mind from start, and needs fine usage of the kernel internals, and
> even sometimes good knowledge of the hardware itself.

Alacritech says the hardware solution makes things very easy for the
application, because _every_ application would gain without having to
take the hardware it runs on into account. These are things CEOs like to
hear, because they think they could save time and money during development
of the application.


I don't think it has to be a problem that the hardware TCP stack doesn't
run any filters or other additional functions, because high-workload
systems (often clusters) usually run on dedicated servers with separate
dedicated firewall machines in front of them.


I think it would be good to support this hardware, because users can then
decide (after testing) which is the best choice for their specific
application and workload.



Kind regards,
Andreas Hartmann


More performance for the TCP stack by using additional hardware chip on NIC

2005-04-17 Thread Andreas Hartmann
Hello!

Alacritech developed a new chip for NICs
(http://www.alacritech.com/html/tech_review.html), which makes it possible
to take the TCP stack away from the host CPU. According to Alacritech, the
host CPU therefore has more performance left for the applications.

This sounds interesting.

Unfortunately, there are two patents covering this solution.

Now I'm wondering whether it is possible to implement any support for these
chips in the Linux kernel. If this hardware solution really does have the
advantages described by Alacritech, it would be a pity if Linux couldn't
use this hardware.

What do you think about that?



Kind regards,
Andreas Hartmann


crypting filesystems

2005-04-04 Thread Andreas Hartmann
Hello,

I want to encrypt some filesystems (/var, /home, /Data). I'm already
running LVM 1 on all these partitions.

I searched for how to do this with Linux and found three ways to achieve
what I want:

1. crypto-loop (with kernel 2.6)
2. loop-AES (with kernel 2.2.x, 2.4.x and 2.6.x)
3. dm-crypt (with kernel 2.6.x)

Because I'm new to filesystem encryption, I searched for documentation on
all of these solutions and found that crypto-loop no longer seems to be
maintained. That left loop-AES and dm-crypt. dm-crypt uses the
device mapper concept, which I have long known from LVM and which therefore
seems to be the most logical solution to me. There is no need to patch the
mount utility, and integration is "out of the box".

So I decided to use dm-crypt with 2.6.11.6. I set up three partitions with
cryptsetup (LUKS), using ESSIV and 256-bit keys, on top of LVM 1, with
reiserfs as the filesystem. The swap partition is encrypted with a random
key that is generated at each boot.
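
The per-boot swap key can be pictured as follows. This is only a sketch of
the idea, assuming the key is taken from /dev/urandom; the name
read_random_key is hypothetical, and in practice the setup tool reads the
key itself rather than a separate program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define SWAP_KEY_BYTES 32 /* 256 bits */

/* Fill key with len random bytes from /dev/urandom.
 * Returns 0 on success, -1 if the device cannot be read. */
static int read_random_key(unsigned char *key, size_t len)
{
	FILE *f;
	size_t got;

	assert(key != NULL && len > 0);
	f = fopen("/dev/urandom", "rb");
	if (f == NULL)
		return -1;
	got = fread(key, 1, len, f);
	fclose(f);
	return got == len ? 0 : -1;
}
```

Because the key is never stored anywhere, the contents of the swap
partition become unreadable after a reboot, which is the intended effect.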

Some questions remain open concerning the security / stability of this
solution.

1. In order to enter the passphrase only once at boot time, I put the
passphrase in a gpg-encrypted file (cipher AES256, 256-bit key size),
which is decrypted at boot time to /tmp (-> tmpfs) and removed with shred
immediately after the three partitions have been activated. Is it possible
to see the cleartext passphrase in tmpfs after this?

2. Is it possible to obtain the passphrase from the active encrypted
partitions (because the passphrase is held somewhere in RAM)?

3. I read at clemens.endorphin.org about four different cipher modes (CBC,
CMC, EME and LRW). What dm-crypt currently implements is the public-IV
on-disk format and ESSIV, both using CBC mode. The other cipher
modes (CMC, EME, LRW) are not implemented yet, although they promise more
security.

My question is:
Has anybody been able to break one of the two implemented on-disk
formats? In other words: are the known problems still a mainly
theoretical discussion today?

4. Do any master keys exist that could be used to open every
encrypted filesystem?

5. I read about problems (corrupted filesystems) with reiserfs (I'm using
V 3.6). Are they fixed in 2.6.11.6? Would it be better to use XFS?



I would be very glad if somebody could give me some advice.


Kind regards,
Andreas Hartmann


Re: 2.4.x oops with X

2005-02-05 Thread Andreas Hartmann
Andreas Hartmann wrote:
> Andreas Hartmann wrote:
> [...]
>> But now, the question is:
>> Why does X crash running kernel 2.4.x with glibc 2.3.4 and not with kernel
>> 2.6.10? Why does X run fine using kernel 2.4 and 2.6 with glibc 2.3.3?
>> 
>> --
>>  |   glibc
>>  |   2.3.3   2.3.4
>> --|-
>> kernel|
>> 2.4  |   X okX segfaults
>> 2.6  |   X okX ok
> 
> 
> Meanwhile, I could find where X crashes using glibc 2.3.4 with kernel 2.4.
> It's this piece of code in linux_vm86.c:267
> 
> static int
> vm86_rep(struct vm86_struct *ptr)
> {
>     int __res;
> 
> #ifdef __PIC__
>     /* When compiling with -fPIC, we can't use asm constraint "b" because
>        %ebx is already taken by gcc. */
>     __asm__ __volatile__("pushl %%ebx\n\t"
>                          "movl %2,%%ebx\n\t"
>                          "movl %1,%%eax\n\t"
>                          "int $0x80\n\t"
>                          "popl %%ebx"
>                          :"=a" (__res)
>                          :"n" ((int)113), "r" ((struct vm86_struct *)ptr));
> #else
>     __asm__ __volatile__("int $0x80\n\t"
>                          :"=a" (__res):"a" ((int)113),
>                          "b" ((struct vm86_struct *)ptr));
> #endif
> 
>     if (__res < 0) {
>         errno = -__res;
>         __res = -1;
>     }
>     else errno = 0;
>     return __res;
> }
> 
> 
> The function ExecX86int10 (vbe.c) calls do_vm86 (linux_vm86.c), which
> calls vm86_rep (linux_vm86.c).
> 
> 
> I don't understand, why this piece of assembler code works fine with glibc
> 2.3.3, but not with glibc 2.3.4, running kernel 2.4.x. It works fine again
> with kernel 2.6.

A solution for this problem can meanwhile be found at
https://bugs.freedesktop.org/show_bug.cgi?id=2431
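
For readers unfamiliar with the tail of vm86_rep() quoted above: it converts
the raw return value of the "int $0x80" system call into the usual libc
convention. On Linux the kernel reports errors as small negative numbers
(-errno). The following sketch shows just that conversion; the helper name
raw_syscall_result is made up, and the range assertion is an addition that
is not in the X source.

```c
#include <assert.h>
#include <errno.h>

/* Convert a raw Linux system call result into the libc convention:
 * on error return -1 and set errno, otherwise return the value and
 * clear errno (as the quoted vm86_rep() does). */
static long raw_syscall_result(long res)
{
	/* The kernel encodes errors in the range [-4095, -1]. */
	assert(res >= -4095);

	if (res < 0) {
		errno = (int)-res;
		return -1;
	}
	errno = 0;
	return res;
}
```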


Kind regards,
Andreas Hartmann


Re: Software Suspend for 2.4 Final Release

2005-01-30 Thread Andreas Hartmann
Nigel Cunningham wrote:
> Hi everyone.
> 
> SoftwareSuspend 2.1.5.7B for the 2.4.28 kernel is now available from
> softwaresuspend.berlios.de.
> 
> Bug fixes and forward ports to 2.4.29 and later kernels notwithstanding,
> it is intended to be the last release of SoftwareSuspend for the 2.4
> series kernels.
> 
> The 2.4 version of Suspend is generally pretty easily to get going, but
> if you have any questions or problems, you will find lots of resources
> at softwaresuspend.berlios.de. In particular, there are HOWTOs, FAQs,
> and a Wiki that you can consult before asking on the mailing lists
> you'll also find there.
> 
> Fuller instructions regarding applying the package can be found in the
> README file, included in the package.
> 
> Nigel


Re: Software Suspend for 2.4 Final Release

2005-01-30 Thread Andreas Hartmann
Nigel Cunningham wrote:
> Hi everyone.
> 
> SoftwareSuspend 2.1.5.7B for the 2.4.28 kernel is now available from
> softwaresuspend.berlios.de.

I'm wondering why you didn't provide a patch against 2.4.29.

Anyway, I tested it against 2.4.29. I couldn't apply the preemption patch.
The other patches could be applied with a few changes. 2.1.5.7B works fine
afterwards, even without spinning up sleeping hard disks during
hibernation! Thank you very much for fixing this problem!



Kind regards,
Andreas Hartmann


Re: 2.6.10 dies when X uses PCI radeon 9200 SE, binary search result

2005-01-22 Thread Andreas Hartmann
Helge Hafting wrote:
> On Fri, Jan 21, 2005 at 09:05:12PM +0100, Andreas Hartmann wrote:
>> Hello Helge,
>> 
>> Helge Hafting wrote:
>> > On Sun, Jan 16, 2005 at 10:41:23PM +1100, Dave Airlie wrote:
>> >> > 
>> >> > I'm fine with adding this code, but we still don't know if this is the
>> >> > cause of his problem. The debug output can determine if this really is
>> >> > the source of the problem or if it is somewhere else.
>> >> > 
>> >> 
>> >> I actually doubt it is this stuff.. my guess is that it is something
>> >> nasty like ACPI breaking int10 for X or something like that... it
>> >> seems a lot more subtle than the usually things that break when we
>> >> mess with the DRM :-)
>> 
>> Which glibc do you use? I have problems with glibc 2.3.4, kernel 2.4.x and
>> X / Xorg while executing the int10-code of X. glibc 2.3.3 works fine for
>> me. But I could find another posting, which describes, that there are even
>> problems with glibc 2.3.3 and kernel 2.4.x.
>> 
>> It's new for me, that there could be problems with kernelversions of 2.6, 
>> too.
>> 
>> Therefore, it would be really interessting to know, which glibc version
>> you are using.
>> 
> I use glibc 2.3.2 from debian testing (or unstable).  
> This is not the problem though, because a reboot into 2.6.8.1 makes
> X work without crashing.  The crash only happens with 2.6.9-rc2
> or later kernels.

Did you try another version of glibc?

> So the only way glibc could be the culprit, is if the newer kernel
> exports some new interface that this glibc manages to mess up.  Still,
> even a buggy glibc shouldn't hang the kernel anyway.

That's certainly correct.

> Such issues
> could crash (all) user apps, but shouldn't prevent the machine from
> responding to sysrq sequences.

You emphasized the differences in the effects. But there is one common
cause in all the cases I know of: int10 crashes X or even the whole kernel.

I could narrow the problem down to the following point:

--
static int
vm86_rep(struct vm86_struct *ptr)
{
    int __res;

#ifdef __PIC__
    /* When compiling with -fPIC, we can't use asm constraint "b" because
       %ebx is already taken by gcc. */
    __asm__ __volatile__("pushl %%ebx\n\t"
                         "movl %2,%%ebx\n\t"
                         "movl %1,%%eax\n\t"
                         "int $0x80\n\t"
                         "popl %%ebx"
                         :"=a" (__res)
                         :"n" ((int)113), "r" ((struct vm86_struct *)ptr));
#else
    __asm__ __volatile__("int $0x80\n\t"
                         :"=a" (__res):"a" ((int)113),
                         "b" ((struct vm86_struct *)ptr));
#endif
    /* Comment from me */
    xf86MsgVerb(X_INFO,3,"my comment\n");
    if (__res < 0) {
        errno = -__res;
        __res = -1;
    }
    else errno = 0;
    return __res;
}

#endif
---

I could see that X crashes with glibc 2.3.4 and kernel 2.4.x (not with
kernel 2.6.x for x <= 10; x > 10 not tested) during the first malloc call
after int10, which is made while executing the function
xf86MsgVerb(X_INFO,3,"my comment\n");


The crashes depend on the combination of software versions in use:

glibc 2.3.3 or 2.3.4 with kernel 2.4.x
glibc 2.3.2 with kernel > 2.6.9rc2

I asked an X developer, but so far he hasn't been able to help either.


I can't say whether glibc or the kernel is the problem. The problem can't be
reliably attributed to glibc, to the kernel, or to X. Therefore, it
_seems_ to me that nobody really cares about it.

I'm willing to help find the problem - but I'm neither a kernel
developer, nor a glibc developer, nor an X developer. I depend on
the support of the developers.

I think one developer from each project should work together to
find the problem. I could ask an X developer I know whether he is
willing to help track it down together with a developer from the
kernel and one from glibc (I don't know whom to ask on the glibc team).


Kind regards,
Andreas Hartmann
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.10 dies when X uses PCI radeon 9200 SE, binary search result

2005-01-21 Thread Andreas Hartmann
Hello Helge,

Helge Hafting wrote:
> On Sun, Jan 16, 2005 at 10:41:23PM +1100, Dave Airlie wrote:
>> > 
>> > I'm fine with adding this code, but we still don't know if this is the
>> > cause of his problem. The debug output can determine if this really is
>> > the source of the problem or if it is somewhere else.
>> > 
>> 
>> I actually doubt it is this stuff.. my guess is that it is something
>> nasty like ACPI breaking int10 for X or something like that... it
>> seems a lot more subtle than the usually things that break when we
>> mess with the DRM :-)

Which glibc do you use? I have problems with glibc 2.3.4, kernel 2.4.x and
X / Xorg while executing the int10 code of X. glibc 2.3.3 works fine for
me. But I found another posting which reports problems even with
glibc 2.3.3 and kernel 2.4.x.

It's new to me that there could be problems with 2.6 kernels, too.

Therefore, it would be really interesting to know which glibc version
you are using.


Kind regards,
Andreas Hartmann


[2.4.5ac19] reproduceable Kernel crashes

2001-06-30 Thread Andreas Hartmann

Hello all!


I have a VIA 686A board (EP7KXA) with 512 MB RAM. If I burn a CD-RW with the 
Philips CDD3610, the whole machine suddenly freezes.

The last message from cdrecord is:
Track 01: 234 of 645 MB written (fifo 100%).  

ksymoops crash
ksymoops 2.4.1 on i686 2.4.5-ac19.  Options used
 -V (default)
 -k /proc/ksyms (default)
 -l /proc/modules (default)
 -o /lib/modules/2.4.5-ac19/ (default)
 -m /usr/src/linux/System.map (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.
 
Warning (compare_maps): mismatch on symbol unix_socket_table  , unix says 
e08b11e0, /lib/modules/2.4.5-ac19/kernel/net/unix/unix.o says e08b0e40.  
Ignoring /lib/modules/2.4.5-ac19/kernel/net/unix/unix.o entry
Oops: 0002
CPU:  0
EIP:  0010:[<080559b2>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00016
Process syslogd (pid 79, stackpage=dbc89000)
e04c45d 080559a0 daae8000 df898200 080559a0 c0257ea4 dbc89e5c  
dbcfe3c0 df898200 c0257ea4 e0a4c535 0001 c1890180 c0257ea4 c1890180
0282 c0257d30 0002 c0257e51 c0185e6f c0257ea4 c1888c40 0401
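Call Trace:  [<e0a4c45d>] [<e0a4c535>] [<c0185e6f>] [<e0a4c4b0>] [<c0107f41>] 
[<c01080ba>] [<c010a1de>] [<c0125606>] [<c012f2f2>] [<c01251d0>] [<c012f411>] 
[<c0106bff>]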
Code: Bad EIP value
 
>>EIP; 080559b2 Before first symbol   <=
Trace; e0a4c45d <[unix].bss.end+19ae76/382a79>
Trace; e0a4c535 <[unix].bss.end+19af4e/382a79>
Trace; c0185e6f 
Trace; e0a4c4b0 <[unix].bss.end+19aec9/382a79>
Trace; c0107f41 
Trace; c01080ba 
Trace; c010a1de 
Trace; c0125606 
Trace; c012f2f2 
Trace; c01251d0 
Trace; c012f411 
Trace; c0106bff 
 
<0> Kernel panic: Aiee, killing interrupt handler!
 
2 warnings issued.  Results may not be reliable.

-
The oops from screen:
Oops: 0002
CPU:0
EIP:0010:[<080559b2>]
EFLAGS: 00016
eax:080559a1 ebx:0246 ecx:4800 edx:dbcfe3c0
esi:df898200 edi:080559a0 ebp:daae8000 esp:dbc89e08
Process syslogd (pid 79, stackpage=dbc89000)
Stack
e04c45d 080559a0 daae8000 df898200 080559a0 c0257ea4 dbc89e5c  
dbcfe3c0 df898200 c0257ea4 e0a4c535 0001 c1890180 c0257ea4 c1890180
0282 c0257d30 0002 c0257e51 c0185e6f c0257ea4 c1888c40 0401
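Call Trace:  [<e0a4c45d>] [<e0a4c535>] [<c0185e6f>] [<e0a4c4b0>] [<c0107f41>] 
[<c01080ba>] [<c010a1de>] [<c0125606>] [<c012f2f2>] [<c01251d0>] [<c012f411>] 
[<c0106bff>]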
Code: Bad EIP value
<0> Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing
-

This problem occurs with only one CD - all other CDs work fine.

If I burn this CD with 2.2.19, the burning process is interrupted too, but 
the machine doesn't freeze and I get the following error:

-
cdrecord: input-/outputerror. write_g1: scsi sendcmd: no error
CDB:  2A 00 00 01 D7 6B 00 00 1F 00
status: 0x2 (CHECK CONDITION)
Sense Bytes: F0 00 06 00 00 00 00 19 00 02 D6 11 29 00 00 00
Sense Key: 0x6 Unit Attention, Segment 0
Sense Code: 0x29 Qual 0x00 (power on, reset, or bus device reset occurred) 
Fru 0x0
Sense flags: Blk 0 (valid)
cmd finished after 58.878s timeout 40s
write track data: error after 247158784 bytes
Sense Bytes: F0 00 00 00 00 00 00 19 00 00 00 61 00 00 00 00 00 00
cdrecord: input-/outputerror. flush cache: scsi sendcmd: no error
CDB:  35 00 00 00 00 00 00 00 00 00
status: 0x2 (CHECK CONDITION)
Sense Bytes: F0 00 02 00 00 00 00 19 00 00 00 62 04 01 00 00
Sense Key: 0x2 Not Ready, Segment 0
Sense Code: 0x04 Qual 0x01 (logical unit is in process of becoming ready) Fru 
0x0
Sense flags: Blk 0 (valid)
cmd finished after 0.005s timeout 120s
Trouble flushing the cache
--

If I do a blank=all (2.2.19) I get the following error:

--
cdrecord: input-/outputerror. blank unit: scsi sendcmd: no error
CDB:  A1 00 00 00 00 00 00 00 00 00 00 00
status: 0x2 (CHECK CONDITION)
Sense Bytes: F0 00 05 00 00 00 00 19 00 02 EA 48 A1 10 00 80
Sense Key: 0x5 Illegal Request, Segment 0
Sense Code: 0xA1 Qual 0x10 (vendor unique sense code 0xA1) [No matching 
qualifier] Fru 0x0
Sense flags: Blk 0 (valid) error refers to data part, bit ptr 0 (not valid) 
field ptr 0
cmd finished after 900.423s timeout 9600s
cdrecord: Cannot blank disk, aborting.
-------
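The first cdrecord error above can be decoded by hand. The sketch below assumes SCSI fixed-format sense data (sense key in the low nibble of byte 2, additional sense code in byte 12, qualifier in byte 13) and the WRITE(10) command layout (opcode 0x2A, big-endian block address in bytes 2 to 5, block count in bytes 7 to 8). The function and struct names are hypothetical, and the 2048-byte sector size is an assumption for a data CD:

```c
#include <stdint.h>

/* Hypothetical container for the three fields cdrecord prints. */
struct sense_info {
    uint8_t key;   /* sense key, e.g. 0x6 = Unit Attention */
    uint8_t asc;   /* additional sense code */
    uint8_t ascq;  /* additional sense code qualifier */
};

/* Decode fixed-format sense data (at least 14 bytes). */
static struct sense_info decode_fixed_sense(const uint8_t *s)
{
    struct sense_info r;
    r.key  = (uint8_t)(s[2] & 0x0F);
    r.asc  = s[12];
    r.ascq = s[13];
    return r;
}

/* Logical block address of a WRITE(10) CDB: bytes 2..5, big-endian. */
static uint32_t write10_lba(const uint8_t *cdb)
{
    return ((uint32_t)cdb[2] << 24) | ((uint32_t)cdb[3] << 16) |
           ((uint32_t)cdb[4] << 8)  |  (uint32_t)cdb[5];
}

/* Number of blocks of a WRITE(10) CDB: bytes 7..8, big-endian. */
static uint16_t write10_blocks(const uint8_t *cdb)
{
    return (uint16_t)(((uint16_t)cdb[7] << 8) | cdb[8]);
}
```

Applied to the bytes above, the sense data decodes to key 0x6, code 0x29, qualifier 0x00, matching cdrecord's own interpretation. The CDB addresses block 120683 with a length of 31 blocks; at 2048 bytes per block that is byte offset 247158784, the same figure cdrecord reports in "write track data: error after 247158784 bytes".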

Regards
Andreas Hartmann



lspci -v (2.4.5ac19)

00:00.0 Host bridge: VIA Technologies, Inc. VT8371 [KX133] (rev 02)
Flags: bus master, medium devsel, lat

[2.4.5] Mysterious behaviour of pppd at 56K modem

2001-06-05 Thread Andreas Hartmann

Hello all!

I have noticed very mysterious behaviour with my serially connected 56K modem. 
During an FTP download, for example, the data arrives in the following pattern:

5,9 kB   -- -- ---  -- -- -- --  --


4,4 kB  -  -  -   --  -  -  -   -

The speed of the incoming data keeps oscillating between 5.9 kB/s and 4.4 kB/s. 
Why? I didn't have this problem with kernel 2.2.x (with the same 
pppd versions).
Nevertheless, the overall speed seems to be the same as with kernel 2.2.x (about 
5.1 kB/s), not slower; it could even be faster. But I think the speed 
could be much higher if it didn't oscillate so much.

I'm using pppd 2.4.0b or 2.4.1. My modem (USR Sportster Message +) is 
connected at 115200 baud (I tested 56000, but it doesn't work properly); the 
connection to my provider runs at 50.6 kbit/s.
My serial hardware is
ttyS00 at 0x03f8 (irq = 4) is a 16550A
on a AMD K6 2 400 with ALI 1541-chipset.
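For orientation, the figures above can be put side by side with a little arithmetic. This is only a sketch under stated assumptions: the connect rate is taken to be 50.6 kbit/s (50600 bit/s), the modem-to-modem link is assumed to carry 8 bits per payload byte, and the serial port is assumed to use 8N1 framing (10 bits on the wire per byte). The function names are hypothetical:

```c
/* Upper bound in bytes/s for a link that carries 8 bits per byte
 * (assumption: no start/stop bits on the modem-to-modem link). */
static unsigned long sync_bytes_per_sec(unsigned long bits_per_sec)
{
    return bits_per_sec / 8;
}

/* Upper bound in bytes/s for an asynchronous serial port with 8N1 framing:
 * 1 start bit + 8 data bits + 1 stop bit = 10 bits per byte. */
static unsigned long async_8n1_bytes_per_sec(unsigned long baud)
{
    return baud / 10;
}
```

Under these assumptions the phone line allows at most 6325 bytes/s before PPP overhead, while the serial port at 115200 baud allows 11520 bytes/s, so the port itself would not be the bottleneck and the observed 5.9 kB/s peaks are close to the line limit.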

I can't see any errors in messages or with ifconfig.
I tested it with or without firewall - always the same behaviour.

The login log doesn't look suspicious:
Jun  5 08:55:23 kernel: CSLIP: code copyright 1989 Regents of the University 
of California
Jun  5 08:55:23 kernel: PPP generic driver version 2.4.1
Jun  5 08:55:23 pppd[1559]: pppd 2.4.1 started by ausgang, uid 1003
Jun  5 08:55:23 pppd[1559]: using channel 1
Jun  5 08:55:23 pppd[1559]: Using interface ppp0
Jun  5 08:55:23 pppd[1559]: Connect: ppp0 <--> /dev/ttyS0
Jun  5 08:55:23 pppd[1559]: sent [LCP ConfReq id=0x1]
Jun  5 08:55:25 last message repeated 2 times
Jun  5 08:55:25 pppd[1559]: rcvd [LCP ConfAck id=0x1]
Jun  5 08:55:26 pppd[1559]: sent [LCP ConfReq id=0x1]
Jun  5 08:55:26 pppd[1559]: rcvd [LCP ConfReq id=0xd6  
   ]
Jun  5 08:55:26 pppd[1559]: sent [LCP ConfAck id=0xd6  
   ]
Jun  5 08:55:26 pppd[1559]: rcvd [LCP ConfAck id=0x1]
Jun  5 08:55:26 pppd[1559]: sent [PAP AuthReq id=0x1 user="somebody" 
password=]
Jun  5 08:55:26 pppd[1559]: rcvd [PAP AuthAck id=0x1 ""]
Jun  5 08:55:26 pppd[1559]: sent [IPCP ConfReq id=0x1  
]
Jun  5 08:55:26 pppd[1559]: rcvd [IPCP ConfReq id=0xaa ]
Jun  5 08:55:26 pppd[1559]: sent [IPCP ConfAck id=0xaa ]
Jun  5 08:55:26 pppd[1559]: rcvd [IPCP ConfRej id=0x1 ]
Jun  5 08:55:26 pppd[1559]: sent [IPCP ConfReq id=0x2 ]
Jun  5 08:55:27 pppd[1559]: rcvd [IPCP ConfNak id=0x2 ]
Jun  5 08:55:27 pppd[1559]: sent [IPCP ConfReq id=0x3 ]
Jun  5 08:55:27 pppd[1559]: rcvd [IPCP ConfAck id=0x3 ]
Jun  5 08:55:27 pppd[1559]: local  IP address 213.7.17.225
Jun  5 08:55:27 pppd[1559]: remote IP address 62.104.220.42

I tried switching off all software compression, but it makes no difference.


Do you have any advice for me?


Regards,
Andreas Hartmann


lspci

00:00.0 Host bridge: Acer Laboratories Inc. [ALi] M1541 (rev 04)
Subsystem: Acer Laboratories Inc. [ALi] ALI M1541 Aladdin V/V+ AGP 
System Controller
Flags: bus master, slow devsel, latency 64
Memory at e600 (32-bit, non-prefetchable) [size=16M]
Capabilities: [b0] AGP version 1.0

00:01.0 PCI bridge: Acer Laboratories Inc. [ALi] M5243 (rev 04) (prog-if 00 
[Normal decode])
Flags: bus master, slow devsel, latency 64
Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
I/O behind bridge: d000-dfff
Memory behind bridge: e480-e5ff
Prefetchable memory behind bridge: e7f0-e7ff
 
00:03.0 Bridge: Acer Laboratories Inc. [ALi] M7101 PMU
Subsystem: Acer Laboratories Inc. [ALi] ALI M7101 Power Management 
Controller
Flags: medium devsel
 
00:07.0 ISA bridge: Acer Laboratories Inc. [ALi] M1533 PCI to ISA Bridge 
[Aladdin IV] (rev c3)
Flags: bus master, medium devsel, latency 0
 
00:0a.0 Multimedia audio controller: Xilinx, Inc. RME Digi96
Flags: slow devsel, IRQ 12
Memory at e300 (32-bit, non-prefetchable) [size=16M]
 
00:0b.0 Ethernet controller: Silicon Integrated Systems [SiS] SiS900 10/100 
Ethernet (rev 02)
Subsystem: Silicon Integrated Systems [SiS] SiS900 10/100 Ethernet 
Adapter
Flags: bus master, medium devsel, latency 32, IRQ 10
I/O ports at b800 [size=256]
Memory at e280 (32-bit, non-prefetchable) [size=4K]
Expansion ROM at  [disabled] [size=128K]
Capabilities: [40] Power Management version 1
 
00:0f.0 IDE interface: Acer Laboratories Inc. [ALi] M5229 IDE (rev c1) 
(prog-if 8a [Master SecP PriP])
Flags: bus master, medium devsel, latency 32
I/O ports at b400 [size=16]
 
01:00.0 VGA compatible controller: ATI Technologies Inc 3D Rage Pro AGP 1X/2X 
(rev 5c) (prog-if 00 [VGA])
Subsystem: ATI Technologies Inc: Unknown device 0084
Flags: bus master, stepping, medium devsel, latency 64, IRQ 11
Memory at e500 (32-bit, non-prefetchable) [size=16M]
I/O ports at d800 [size=256]
Memory at e480 (32-bit, non-prefetchable) [size=4K]
Expansion ROM at e7fe [disabled] [size=128K]

Re: [2.4.5 and all ac-Patches] massive file corruption with reiser or NFS

2001-06-02 Thread Andreas Hartmann

On Saturday, 2 June 2001 20:33, Chris Mason wrote:
> On Saturday, June 02, 2001 08:13:44 PM +0200 Andreas Hartmann
> <[EMAIL PROTECTED]> wrote:
> > On Saturday, 2 June 2001 18:42, you wrote:
> >> On Saturday, June 02, 2001 02:41:04 PM +0200 Andreas Hartmann
> >> <[EMAIL PROTECTED]> wrote:
> >> > On Saturday, 2 June 2001 12:52, Rasmus Bøg Hansen wrote:
> >> >> On Sat, 2 Jun 2001, Andreas Hartmann wrote:
> >> >> > I got massive file corruptions with the kernels mentioned in the
> >> >> > subject. I can reproduce it every time.
> >> >>
> >> >> You cannot use NFS on reiserfs unless you apply the knfsd patch.
> >> >> Look at www.namesys.com.
> >> >
> >> > Thank you very much for your advice.
> >> >
> >> > I tested your suggestion and ran the machine without NFS-mounted
> >> > devices - it seems to be working fine.
> >> >
> >> > Anyway - I'm wondering why I didn't get any problem until 2.4.4ac10
> >> > with this configuration without the appropriate patch on the client
> >> > or on the server?
> >>
> >> The problem only happens when the clients do an operation on a file
> >> that has gone out of cache on the server.  Under light load, this might
> >> happen very rarely.
> >
> > The load didn't change. You can forget the load, it's very small. It's
> > my private server and I'm always doing the same thing via NFS -
> > compiling, for example.  This has been working fine until 2.4.4ac10;
> > afterwards it has been broken.
>
> Ok, there are two different problems here.  The patch you posted to l-k is
> a generic NFS fix for 2.4.5.  ext2 would need this too.
>
> If you are serving NFS from your reiserfs disk, you need an additional
> patch on the server only (this is the one I was talking about).  Checkout
> the FAQ on www.namesys.com for all the details.

Just for my understanding:
As long as I used 2.2.19 without the patch on the server (it serves only 
reiserfs-based data) and 2.4.[1234]ac[...] on the client (also without the 
patch), it worked because of the light load on the server.

Since 2.4.4ac11, something in NFS has been broken (and still is in 2.4.5), 
so I ran into problems.

If this conclusion is right, the following combination should work under the 
light load I usually have: unpatched 2.2.19 on the server (as it worked with 
2.4.4ac10 on the client) and NFS-patched 2.4.5 on the client. My test showed 
that it does work.

The current combination, 2.4.5 on both the server and the client with the 
mentioned NFS patch, is the same situation as unpatched 2.2.19 on the 
server with the NFS patch only on the 2.4.5 client.

I hope that there will soon be a 2.4.5 knfsd patch for my server, because 
that is the safe way! And I hope the broken NFS on the client in the 
2.4.5 kernels will be fixed soon - maybe in the next ac patch?

Regards,
Andreas Hartmann



Re: [2.4.5 and all ac-Patches] massive file corruption with reiser or NFS

2001-06-02 Thread Andreas Hartmann

On Saturday, 2 June 2001 18:42, you wrote:
> On Saturday, June 02, 2001 02:41:04 PM +0200 Andreas Hartmann
> <[EMAIL PROTECTED]> wrote:
> > On Saturday, 2 June 2001 12:52, Rasmus Bøg Hansen wrote:
> >> On Sat, 2 Jun 2001, Andreas Hartmann wrote:
> >> > I got massive file corruptions with the kernels mentioned in the
> >> > subject. I can reproduce it every time.
> >>
> >> You cannot use NFS on reiserfs unless you apply the knfsd patch. Look
> >> at www.namesys.com.
> >
> > Thank you very much for your advice.
> >
> > I tested your suggestion and ran the machine without NFS-mounted
> > devices - it seems to be working fine.
> >
> > Anyway - I'm wondering why I didn't get any problem until 2.4.4ac10
> > with this configuration without the appropriate patch on the client
> > or on the server?
>
> The problem only happens when the clients do an operation on a file that
> has gone out of cache on the server.  Under light load, this might happen
> very rarely.

The load didn't change. You can forget the load, it's very small. It's my 
private server and I'm always doing the same thing via NFS - compiling, for 
example. This has been working fine until 2.4.4ac10; afterwards it has been 
broken.

>
> You only need the patch on the server.

My experience today is different: I need the patch on both the server and the 
client (both 2.4.5) to get it working. See my other mail to Alan on the 
list.

Regards,
Andreas Hartmann



Re: [2.4.5 and all ac-Patches] massive file corruption with reiser or NFS

2001-06-02 Thread Andreas Hartmann

On Saturday, 2 June 2001 18:19, you wrote:
> > I got massive file corruptions with the kernels mentioned in the subject.
> > I can reproduce it every time.
>
> Which other 2.4 trees have you tried ?

I had the following setups:

NFS server:
linux 2.2.19

NFS client:
linux 2.4.[32]ac[...],
linux 2.4.4ac[1-...]
ac1 to ac10 worked fine. Beginning with ac11, I got the problems I 
described. During this time, I never used any knfsd patch.



The following combination seems to work fine:

NFS server:
linux 2.2.19 with the knfsd patch, or linux 2.4.5 with the following knfsd patch 
from Gergely Tamas <[EMAIL PROTECTED]> (I got it from the reiserfs 
mailing list; there is no patch for ac6):

--
--- linux-2.4.5/fs/inode.c.orig Fri May 25 14:15:38 2001
+++ linux-2.4.5/fs/inode.c  Wed May 30 12:17:29 2001
@@ -1044,6 +1044,8 @@
inode->i_state|=I_FREEING;
inodes_stat.nr_inodes--;
spin_unlock(&inode_lock);
+   if (inode->i_data.nrpages)
+   truncate_inode_pages(&inode->i_data, 0);
clear_inode(inode);
}
}

--- linux-2.4.5-pre6/fs/nfs/dir.c.orig  Fri May 25 14:15:38 2001
+++ linux-2.4.5-pre6/fs/nfs/dir.c   Thu May 31 14:53:32 2001
@@ -753,6 +753,8 @@

nfs_zap_caches(dir);
error = NFS_PROTO(dir)->rmdir(dir, &dentry->d_name);
+   if (!error)
+   dentry->d_inode->i_nlink -= 2;

return error;
 }
@@ -870,6 +872,8 @@
error = NFS_PROTO(dir)->remove(dir, &dentry->d_name);
if (error < 0)
goto out;
+   if (inode)
+   inode->i_nlink--;

  out_delete:
/*


I patched the original 2.4.5-sources.

NFS Client:
linux 2.4.5 with knfsd-patch.

I need the patch on both the server and the client to get it working.

>
> Does booting with ide=nodma help ? [only in -ac]

I tested the following combination:

Server
2.2.19 without knfsd-Patch

Client
2.4.5ac6 without knfsd-Patch but ide=nodma

Result:
I/O errors, as described in my initial posting.


Regards
Andreas Hartmann



Re: [2.4.5 and all ac-Patches] massive file corruption with reiser or NFS

2001-06-02 Thread Andreas Hartmann

On Saturday, 2 June 2001 12:52, Rasmus Bøg Hansen wrote:
> On Sat, 2 Jun 2001, Andreas Hartmann wrote:
> > I got massive file corruptions with the kernels mentioned in the subject.
> > I can reproduce it every time.
>
> You cannot use NFS on reiserfs unless you apply the knfsd patch. Look at
> www.namesys.com.

Thank you very much for your advice.

I tested your suggestion and ran the machine without NFS-mounted devices - it 
seems to be working fine. 

Anyway - I'm wondering why I didn't get any problem until 2.4.4ac10 with this 
configuration without the appropriate patch on the client or on the server?

I'm a little bit confused now about this patch.
Do I need this knfsd patch for the NFS server, just for the clients, or for 
both?


Thank you for your advice,
Andreas Hartmann



[2.4.5 and all ac-Patches] massive file corruption with reiser or NFS

2001-06-02 Thread Andreas Hartmann

Hello all,

I get massive file corruption with the kernels mentioned in the subject. I 
can reproduce it every time.

What happens?

The kernel can't find files or directories which were created seconds 
before. If I run the configure script of some program, for example, the 
conftest directory created during configure can't be found again; 
data can't be read from files (if the file can be opened at all). I then get an 
I/O error. It doesn't matter whether it's local on reiserfs or remote via NFS. 
It's always the same problem.

If you want to mount a partition, as in mount /boot, the /boot directory 
can't be found (I/O error). If you try it again and again, it will suddenly 
work :-(.

I tried to compile the 2.4.5ac6 kernel while running it. It ended in massive 
errors while reading the sources.
I rebooted the machine (with a lot of errors while unmounting) into kernel 
2.2.19 and tried to compile the above-mentioned kernel again. I got a lot of 
other errors -> 2.4.5 destroyed the files! I had to do an rm -R to get rid of 
the whole tree. After recreating the tree, compiling under kernel 
2.2.19 worked fine.

I tried to compile the kernel with egcs 1.1.2 and gcc 2.95.3 - no difference. 
I tried to compile with and without APIC - no difference.

Do you have any suggestions on how to compile the kernel to get it working and 
to locate the problem?


Some additional infos to my machine:
512 MB RAM
AMD Athlon 800
no overclocking


lspci -v
00:00.0 Host bridge: VIA Technologies, Inc. VT8371 [KX133] (rev 02)
Flags: bus master, medium devsel, latency 0
Memory at d600 (32-bit, prefetchable)
Capabilities: [a0] AGP version 2.0

00:01.0 PCI bridge: VIA Technologies, Inc. VT8371 [PCI-PCI Bridge] (prog-if 
00 [Normal decode])
Flags: bus master, 66Mhz, medium devsel, latency 0
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
I/O behind bridge: c000-cfff
Memory behind bridge: d400-d5ff
Prefetchable memory behind bridge: d000-d3ff
Capabilities: [80] Power Management version 2

00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 22)
Subsystem: VIA Technologies, Inc. VT82C686/A PCI to ISA Bridge
Flags: bus master, stepping, medium devsel, latency 0

00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10) 
(prog-if 8a [Master SecP PriP])
Flags: bus master, medium devsel, latency 32
I/O ports at d000
Capabilities: [c0] Power Management version 2

00:07.2 USB Controller: VIA Technologies, Inc. VT82C586B USB (rev 10) 
(prog-if 00 [UHCI]) Subsystem: Unknown device 0925:1234
Flags: bus master, medium devsel, latency 32, IRQ 9
I/O ports at d400
Capabilities: [80] Power Management version 2

00:07.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 
30)
Flags: medium devsel
Capabilities: [68] Power Management version 2

00:07.5 Multimedia audio controller: VIA Technologies, Inc. VT82C686 [Apollo 
Super AC97/Audio] (rev 20)
Subsystem: VIA Technologies, Inc.: Unknown device 4511
Flags: medium devsel, IRQ 5
I/O ports at dc00
I/O ports at e000
I/O ports at e400
Capabilities: [c0] Power Management version 2

00:08.0 Ethernet controller: Silicon Integrated Systems [SiS] SiS900 10/100 
Ethernet (rev 02)
Subsystem: Silicon Integrated Systems [SiS] SiS900 10/100 Ethernet 
Adapter
Flags: bus master, medium devsel, latency 32, IRQ 10
I/O ports at e800
Memory at d900 (32-bit, non-prefetchable)
Capabilities: [40] Power Management version 1
 
00:09.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139 (rev 10)
Subsystem: Realtek Semiconductor Co., Ltd. RT8139
Flags: bus master, medium devsel, latency 32, IRQ 11
I/O ports at ec00
Memory at d9001000 (32-bit, non-prefetchable)
 
01:00.0 VGA compatible controller: ATI Technologies Inc Rage 128 PF (prog-if 
00 [VGA])
Subsystem: ATI Technologies Inc: Unknown device 0008
Flags: bus master, stepping, 66Mhz, medium devsel, latency 32, IRQ 10
Memory at d000 (32-bit, prefetchable)
I/O ports at c000
Memory at d500 (32-bit, non-prefetchable)
Capabilities: [50] AGP version 2.0
Capabilities: [5c] Power Management version 2





My .config file (without the options that are not set):

#
# Automatically generated by make menuconfig: don't edit
#
CONFIG_X86=y
CONFIG_ISA=y
CONFIG_UID16=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODVERSIONS=y
CONFIG_KMOD=y

#
# Processor type and features
#
CONFIG_MK7=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_TSC=y
CONFIG_X86

Re: [2.4.4ac11 - 2.4.5 and 2.4.5ac5] problems with stat continue (ln -s broken ...?)

2001-05-30 Thread Andreas Hartmann

Hello all!

I know that I'm repeating myself, but the problem is still the same. It's 
impossible for me to use the kernels mentioned in the subject.

I think there is a parallel to the mail "ln -s broken on 2.4.5", or am I 
the problem? Who can help me?

Regards,
Andreas Hartmann


> Hallo all,
>
> I get the mentioned error as often as longer the system is running. E.g.:
> $ ls kviewshell/.libs/libkmultipage.so
>
> The following is what strace say's:
>
> []
> 3795  lstat("kviewshell/.libs/libkmultipage.so", 0xb718) = -1 EIO
> (Input/output error)
> [...]
>
> The file really exists and is correct!
>
> Another example is:
> umount /boot
>
> That's what strace is saying:
> [...]
>
> 3762  open("/etc/mtab~3762", O_WRONLY|O_CREAT, 0) = 4
> 3762  close(4)  = 0
> 3762  link("/etc/mtab~3762", "/etc/mtab~") = -1 ENOENT (No such file or
> directory)
> 3762  unlink("/etc/mtab~3762")  = 0
> []
>
> I've got no problems with 2.4.4ac9 and ac10. The Problems start with ac11
> and can be found until the actual ac17.
>
>
> Versions:
> Linux athlon 2.4.4-ac16 #1 Don Mai 24 22:47:31 CEST 2001 i686 unknown
>
> Gnu C                  2.95.3
> Gnu make               3.76.1
> binutils               2.9.5.0.12
> util-linux             2.10s
> mount                  2.10s
> modutils               2.4.5
> e2fsprogs              1.19
> PPP                    2.4.0b1
> Linux C Library        2.1.3
> Dynamic linker (ldd)   2.1.3
> Procps                 2.0.2
> Net-tools              1.56
> Kbd                    0.96
> Sh-utils               2.0g
> Modules Loaded         ext2 nfs lockd sunrpc 8139too sis900 serial
> parport_pc lp parport unix
>
> I'm using reiserfs, 3.5.x disk format.
>
>
> $ lspci -v
>
> 00:00.0 Host bridge: VIA Technologies, Inc. VT8371 [KX133] (rev 02)
>   Flags: bus master, medium devsel, latency 0
>   Memory at d600 (32-bit, prefetchable) [size=32M]
>   Capabilities: 
>
> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8371 [PCI-PCI Bridge] (prog-if
> 00 [Normal decode])
>   Flags: bus master, 66Mhz, medium devsel, latency 0
>   Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
>   I/O behind bridge: c000-cfff
>   Memory behind bridge: d400-d5ff
>   Prefetchable memory behind bridge: d000-d3ff
>   Capabilities: 
>
> 00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 22)
>   Subsystem: VIA Technologies, Inc. VT82C686/A PCI to ISA Bridge
>   Flags: bus master, stepping, medium devsel, latency 0
>
> 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev
> 10) (prog-if 8a [Master SecP PriP])
>   Flags: bus master, medium devsel, latency 32
>   I/O ports at d000 [size=16]
>   Capabilities: 
>
> 00:07.2 USB Controller: VIA Technologies, Inc. VT82C586B USB (rev 10)
> (prog-if 00 [UHCI])
>   Subsystem: Unknown device 0925:1234
>   Flags: bus master, medium devsel, latency 32, IRQ 9
>   I/O ports at d400 [size=32]
>   Capabilities: 
>
> 00:07.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI]
> (rev 30)
>   Flags: medium devsel, IRQ 9
>   Capabilities: 
>
> 00:07.5 Multimedia audio controller: VIA Technologies, Inc. VT82C686
> [Apollo Super AC97/Audio] (rev 20)
>   Subsystem: VIA Technologies, Inc.: Unknown device 4511
>   Flags: medium devsel, IRQ 5
>   I/O ports at dc00 [size=256]
>   I/O ports at e000 [size=4]
>   I/O ports at e400 [size=4]
>   Capabilities: 
>
> 00:08.0 Ethernet controller: Silicon Integrated Systems [SiS] SiS900 10/100
> Ethernet (rev 02)
>   Subsystem: Silicon Integrated Systems [SiS] SiS900 10/100 Ethernet Adapter
>   Flags: bus master, medium devsel, latency 32, IRQ 10
>   I/O ports at e800 [size=256]
>   Memory at d900 (32-bit, non-prefetchable) [size=4K]
>   Expansion ROM at  [disabled] [size=128K]
>   Capabilities: 
>
> 00:09.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139 (rev
> 10) Subsystem: Realtek Semiconductor Co., Ltd. RT8139
>   Flags: bus master, medium devsel, latency 32, IRQ 11
>   I/O ports at ec00 [size=256]
>   Memory at d9001000 (32-bit, non-prefetchable) [size=256]
>   Expansion ROM at  [disabled] [size=64K]
>
> 01:00.0 VGA compatible controller: ATI Technologies Inc Rage 128 PF
> (prog-if 00 [VGA])
>   Subsystem: ATI Technologies Inc: Unknown device 0008
>   Flags: bus master, stepping, 66Mhz, medium devsel, latency 32, I

[2.4.4ac11-17] lasting problems with stat or link

2001-05-26 Thread Andreas Hartmann

Hello all,

The longer the system has been running, the more often I get the mentioned error. E.g.:
$ ls kviewshell/.libs/libkmultipage.so

The following is what strace says:

[]
3795  lstat("kviewshell/.libs/libkmultipage.so", 0xb718) = -1 EIO
(Input/output error)
[...]

The file really exists and is correct!
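A failure like the one in the strace excerpt above can be checked without ls by calling lstat() directly and counting how often it fails. This is only a sketch for narrowing down an intermittent error; the helper names lstat_errno and count_eio are hypothetical:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/stat.h>

/* Returns 0 if lstat() on path succeeds, otherwise the errno value
 * with which it failed (e.g. EIO or ENOENT). */
static int lstat_errno(const char *path)
{
    struct stat st;

    if (lstat(path, &st) == 0)
        return 0;
    return errno;
}

/* Calls lstat() on path `tries` times and returns how many of those
 * calls failed with EIO. */
static int count_eio(const char *path, int tries)
{
    int failures = 0;
    int i;

    for (i = 0; i < tries; i++) {
        if (lstat_errno(path) == EIO)
            failures++;
    }
    return failures;
}
```

On a healthy filesystem count_eio() would be expected to return 0 for an existing file; a non-zero count for a file that is known to exist reproduces the symptom described here.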

Another example is:
umount /boot

That's what strace is saying:
[...]

3762  open("/etc/mtab~3762", O_WRONLY|O_CREAT, 0) = 4
3762  close(4)  = 0
3762  link("/etc/mtab~3762", "/etc/mtab~") = -1 ENOENT (No such file or
directory)
3762  unlink("/etc/mtab~3762")  = 0
[]
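The system calls in the excerpt above follow a common lock-file pattern: create a uniquely named temporary file, then hard-link it to the lock name. A minimal sketch of that pattern, not the actual mount source; the function name take_link_lock is hypothetical, and the file mode 0600 is an assumption (the trace shows mode 0):

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Try to take the lock `lock` by creating `tmp` and hard-linking it to
 * `lock`. Returns 0 on success, otherwise the errno of the failing step. */
static int take_link_lock(const char *tmp, const char *lock)
{
    int err = 0;
    int fd = open(tmp, O_WRONLY | O_CREAT, 0600);

    if (fd < 0)
        return errno;
    close(fd);

    if (link(tmp, lock) != 0)
        err = errno;   /* EEXIST would mean the lock is already held */

    unlink(tmp);       /* the temporary name is no longer needed */
    return err;
}
```

In this pattern the usual failure of the link() step is EEXIST, when another process holds the lock. ENOENT, as in the trace, says that a path component was not found, which is surprising directly after the temporary file was created successfully in the same directory.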

I have no problems with 2.4.4ac9 and ac10. The problems start with ac11 and 
are still present in the current ac17.


Versions:
Linux athlon 2.4.4-ac16 #1 Don Mai 24 22:47:31 CEST 2001 i686 unknown

Gnu C                  2.95.3
Gnu make               3.76.1
binutils               2.9.5.0.12
util-linux             2.10s
mount                  2.10s
modutils               2.4.5
e2fsprogs              1.19
PPP                    2.4.0b1
Linux C Library        2.1.3
Dynamic linker (ldd)   2.1.3
Procps                 2.0.2
Net-tools              1.56
Kbd                    0.96
Sh-utils               2.0g
Modules Loaded         ext2 nfs lockd sunrpc 8139too sis900 serial
parport_pc lp parport unix

I'm using reiserfs, 3.5.x disk format.


$ lspci -v

00:00.0 Host bridge: VIA Technologies, Inc. VT8371 [KX133] (rev 02)
Flags: bus master, medium devsel, latency 0
Memory at d600 (32-bit, prefetchable) [size=32M]
Capabilities: 

00:01.0 PCI bridge: VIA Technologies, Inc. VT8371 [PCI-PCI Bridge] (prog-if 
00 [Normal decode])
Flags: bus master, 66Mhz, medium devsel, latency 0
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
I/O behind bridge: c000-cfff
Memory behind bridge: d400-d5ff
Prefetchable memory behind bridge: d000-d3ff
Capabilities: 

00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 22)
Subsystem: VIA Technologies, Inc. VT82C686/A PCI to ISA Bridge
Flags: bus master, stepping, medium devsel, latency 0

00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10) 
(prog-if 8a [Master SecP PriP])
Flags: bus master, medium devsel, latency 32
I/O ports at d000 [size=16]
Capabilities: 

00:07.2 USB Controller: VIA Technologies, Inc. VT82C586B USB (rev 10) 
(prog-if 00 [UHCI])
Subsystem: Unknown device 0925:1234
Flags: bus master, medium devsel, latency 32, IRQ 9
I/O ports at d400 [size=32]
Capabilities: 

00:07.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 
30)
Flags: medium devsel, IRQ 9
Capabilities: 

00:07.5 Multimedia audio controller: VIA Technologies, Inc. VT82C686 [Apollo 
Super AC97/Audio] (rev 20)
Subsystem: VIA Technologies, Inc.: Unknown device 4511
Flags: medium devsel, IRQ 5
I/O ports at dc00 [size=256]
I/O ports at e000 [size=4]
I/O ports at e400 [size=4]
Capabilities: 

00:08.0 Ethernet controller: Silicon Integrated Systems [SiS] SiS900 10/100 
Ethernet (rev 02)
Subsystem: Silicon Integrated Systems [SiS] SiS900 10/100 Ethernet Adapter
Flags: bus master, medium devsel, latency 32, IRQ 10
I/O ports at e800 [size=256]
Memory at d900 (32-bit, non-prefetchable) [size=4K]
Expansion ROM at  [disabled] [size=128K]
Capabilities: 

00:09.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139 (rev 10)
Subsystem: Realtek Semiconductor Co., Ltd. RT8139
Flags: bus master, medium devsel, latency 32, IRQ 11
I/O ports at ec00 [size=256]
Memory at d9001000 (32-bit, non-prefetchable) [size=256]
Expansion ROM at  [disabled] [size=64K]

01:00.0 VGA compatible controller: ATI Technologies Inc Rage 128 PF (prog-if 
00 [VGA])
Subsystem: ATI Technologies Inc: Unknown device 0008
Flags: bus master, stepping, 66Mhz, medium devsel, latency 32, IRQ 10
Memory at d000 (32-bit, prefetchable) [size=64M]
I/O ports at c000 [size=256]
Memory at d500 (32-bit, non-prefetchable) [size=16K]
Expansion ROM at  [disabled] [size=128K]
Capabilities: 



Regards,
Andreas Hartmann
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



[2.4.4ac15/16] problems with stat or link

2001-05-24 Thread Andreas Hartmann
4)  = 0
3762  write(2, "Verkn\374pfung f\374r die Lock-Datei /"..., 160) = 160
3762  write(2, "\n", 1) = 1
3762  _exit(16) = ?


You can sometimes get rid of the problem by unmounting and remounting the 
affected partition.
The problem occurs on local filesystems as well as on NFS-mounted filesystems 
(which also reside on reiserfs on the server).
If the problem occurs on an NFS-mounted filesystem, the file can still be 
stat'ed without any problem on the original filesystem on the server (e.g. 
via telnet or ssh).
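
The lock-file message in the strace output above suggests that the failing program creates its lock with a hard link. The following is only a minimal sketch of that classic link()-plus-stat() scheme, assuming this is what the program does; the function names and the lock file name are hypothetical. It can be pointed at a directory on the affected partition to exercise link() and stat() outside the original program.

```python
import os
import tempfile


def try_link_lock(directory, lock_name="test.lock"):
    """Try to take a link()-based lock in `directory`.

    Returns (acquired, nlink). The names here are illustrative only.
    """
    # The temporary file must live on the same filesystem as the lock.
    fd, tmp_path = tempfile.mkstemp(prefix="lck.", dir=directory)
    os.close(fd)
    lock_path = os.path.join(directory, lock_name)
    try:
        try:
            os.link(tmp_path, lock_path)  # atomic; fails if the lock exists
        except FileExistsError:
            pass  # the lock is already held
        # Cross-check with stat(): a link count of 2 means the link exists.
        # Any other OSError from link() or stat() is left to propagate,
        # since that is the kind of failure being looked for.
        nlink = os.stat(tmp_path).st_nlink
        return nlink == 2, nlink
    finally:
        os.unlink(tmp_path)


def release_lock(directory, lock_name="test.lock"):
    """Remove the lock file again."""
    os.unlink(os.path.join(directory, lock_name))
```

Calling try_link_lock() twice on a healthy filesystem should succeed the first time and report a held lock the second time; an unexpected OSError on a directory of the affected partition would match the behaviour described above.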

I've got no problems with 2.4.4ac9 (I didn't use the patches 10-14).

Versions:
Linux athlon 2.4.4-ac16 #1 Don Mai 24 22:47:31 CEST 2001 i686 unknown
 
Gnu C                  2.95.3
Gnu make               3.76.1
binutils               2.9.5.0.12
util-linux             2.10s
mount                  2.10s
modutils               2.4.5
e2fsprogs              1.19
PPP                    2.4.0b1
Linux C Library        2.1.3
Dynamic linker (ldd)   2.1.3
Procps                 2.0.2
Net-tools              1.56
Kbd                    0.96
Sh-utils               2.0g
Modules Loaded         ext2 nfs lockd sunrpc 8139too sis900 serial parport_pc lp parport unix

I'm using reiserfs, 3.5.x disk format.


Regards,
Andreas Hartmann



Re: [2.4.4ac4] Kernel crash while unmounting CD: cause and solution

2001-05-11 Thread Andreas Hartmann

Hello all,

> [1.] One line summary of the problem:
>
> Kernel panic when trying to unmount a ide-scsi cdrom.

The cause was a CD-RW drive that was not working properly. I cleaned the 
optical lens, and the drive now works like new, even with the same CDs that 
it previously refused to burn and that caused the crash.


Regards,
Andreas Hartmann



[2.4.4ac4] Kernel crash while unmounting CD

2001-05-10 Thread Andreas Hartmann
B2B+ ParErr- DEVSEL=medium >TAbort- 
SERR-  [disabled] [size=64K]

01:00.0 VGA compatible controller: ATI Technologies Inc Rage 128 PF (prog-if 
00 [VGA])
Subsystem: ATI Technologies Inc: Unknown device 0008
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping+ SERR- FastB2B-
Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
SERR-  [disabled] [size=128K]
Capabilities: [50] AGP version 2.0
Status: RQ=31 SBA+ 64bit- FW- Rate=x1,x2
Command: RQ=31 SBA+ AGP+ 64bit- FW- Rate=
Capabilities: [5c] Power Management version 2
Flags: PMEClk- DSI- D1+ D2- AuxCurrent=0mA 
PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-



[7.6.] SCSI information (from /proc/scsi/scsi)

Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor:  Model: ATAPI CDROM.48X  Rev: 120Y
  Type:   CD-ROM   ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 01 Lun: 00
  Vendor: PHILIPS  Model: CDD3610 CD-R/RW  Rev: 3.01
  Type:   CD-ROM   ANSI SCSI revision: 02

[7.7.] Other information that might be relevant to the problem
   (please look in /proc and include all information that you
   think to be relevant):

Please ask if you need any further information!


[X.] Other notes, patches, fixes, workarounds:

Ditto.

Regards,
Andreas Hartmann



Re: [2.4.3ac11] clock timer configuration lost - probably a VIA686a motherboard

2001-04-22 Thread Andreas Hartmann

On Sunday, 22 April 2001 14:46, Alan Cox wrote:
> > I got a lot of messages while continuously writing / reading data from
> > one hard disk to another (both on the first IDE channel) during a backup
> > with rsync. Both hard disks use UDMA4. The data rate was between 0.5
> > MB/s and 20 MB/s.
> > I never got these messages before, and after the backup finished I
> > didn't see them anymore.
>
> They do trigger too easily. I'm still investigating that

I just read in a German newsgroup about another user with the same problem, 
not on a VIA chipset but on a GA-5AX with the ALi V chipset and an iP 200 
MMX. Unfortunately he didn't say which kernel he is using or in which 
situation the message occurs.

Regards,
Andreas Hartmann



[2.4.3ac11] clock timer configuration lost - probably a VIA686a motherboard

2001-04-22 Thread Andreas Hartmann
 chip 
configuration.
Apr 22 11:38:03 athlon kernel: probable hardware bug: clock timer 
configuration lost - probably a VIA686a motherboard.
Apr 22 11:38:03 athlon kernel: probable hardware bug: restoring chip 
configuration.
Apr 22 11:38:32 athlon kernel: probable hardware bug: clock timer 
configuration lost - probably a VIA686a motherboard.
Apr 22 11:38:32 athlon kernel: probable hardware bug: restoring chip 
configuration.
Apr 22 11:38:52 athlon kernel: probable hardware bug: clock timer 
configuration lost - probably a VIA686a motherboard.
Apr 22 11:38:52 athlon kernel: probable hardware bug: restoring chip 
configuration.
Apr 22 11:44:40 athlon kernel: probable hardware bug: clock timer 
configuration lost - probably a VIA686a motherboard.
Apr 22 11:44:40 athlon kernel: probable hardware bug: restoring chip 
configuration.
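
To see how often the warning fires during a run such as the rsync backup, the syslog lines can be counted per minute. This is a minimal sketch, assuming the messages are available one per line (the excerpt above is wrapped by the mail client, so the sample lines below are re-joined by hand); the function name is hypothetical.

```python
import re

# Matches the "clock timer configuration lost" warning and captures the
# timestamp up to the minute, e.g. "Apr 22 11:38".
PATTERN = re.compile(
    r"^(\w{3}\s+\d+ \d{2}:\d{2}):\d{2} \S+ kernel: "
    r"probable hardware bug: clock timer configuration lost"
)


def count_per_minute(lines):
    """Return {minute: number of 'configuration lost' warnings}."""
    counts = {}
    for line in lines:
        match = PATTERN.match(line)
        if match:
            minute = match.group(1)
            counts[minute] = counts.get(minute, 0) + 1
    return counts


LOST = ("probable hardware bug: clock timer configuration lost"
        " - probably a VIA686a motherboard.")
RESTORE = "probable hardware bug: restoring chip configuration."

# Sample lines taken from the excerpt above, with the wrapping undone.
sample = [
    "Apr 22 11:38:03 athlon kernel: " + LOST,
    "Apr 22 11:38:03 athlon kernel: " + RESTORE,
    "Apr 22 11:38:32 athlon kernel: " + LOST,
    "Apr 22 11:38:32 athlon kernel: " + RESTORE,
    "Apr 22 11:38:52 athlon kernel: " + LOST,
    "Apr 22 11:38:52 athlon kernel: " + RESTORE,
    "Apr 22 11:44:40 athlon kernel: " + LOST,
    "Apr 22 11:44:40 athlon kernel: " + RESTORE,
]

for minute, n in count_per_minute(sample).items():
    print(minute, n)
```

Only the "configuration lost" lines are counted; the matching "restoring chip configuration" lines are skipped so that each event is counted once.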


I have an EP7KXA board with the VIA KX133 chipset.

cat /proc/pci
PCI devices found:
  Bus  0, device   0, function  0:
Host bridge: VIA Technologies, Inc. VT8371 [KX133] (rev 2).
  Prefetchable 32 bit memory at 0xd600 [0xd7ff].
  Bus  0, device   1, function  0:
PCI bridge: VIA Technologies, Inc. VT8371 [KX133 AGP]  (rev 0).
  Master Capable.  No bursts.  Min Gnt=12.
  Bus  0, device   7, function  0:
ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 34).
  Bus  0, device   7, function  1:
IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 16).
  Master Capable.  Latency=32.
  I/O at 0xd000 [0xd00f].
  Bus  0, device   7, function  2:
USB Controller: VIA Technologies, Inc. UHCI USB (rev 16).
  IRQ 9.
  Master Capable.  Latency=32.
  I/O at 0xd400 [0xd41f].
  Bus  0, device   7, function  4:
Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 48).
  IRQ 9.
  Bus  0, device   7, function  5:
Multimedia audio controller: VIA Technologies, Inc. AC97 Audio Controller 
(rev 32).
  IRQ 5.
  I/O at 0xdc00 [0xdcff].
  I/O at 0xe000 [0xe003].
  I/O at 0xe400 [0xe403].
  Bus  0, device   8, function  0:
Ethernet controller: Silicon Integrated Systems [SiS] SiS900 10/100 
Ethernet (rev 2).
  IRQ 10.
  Master Capable.  Latency=32.  Min Gnt=52.Max Lat=11.
  I/O at 0xe800 [0xe8ff].
  Non-prefetchable 32 bit memory at 0xd900 [0xd9000fff].
  Bus  0, device   9, function  0:
Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139 (rev 16).
  IRQ 11.
  Master Capable.  Latency=32.  Min Gnt=32.Max Lat=64.
  I/O at 0xec00 [0xecff].
  Non-prefetchable 32 bit memory at 0xd9001000 [0xd90010ff].
  Bus  1, device   0, function  0:
VGA compatible controller: ATI Technologies Inc Rage 128 PF (rev 0).
  IRQ 10.
  Master Capable.  Latency=32.  Min Gnt=8.
  Prefetchable 32 bit memory at 0xd000 [0xd3ff].
  I/O at 0xc000 [0xc0ff].
  Non-prefetchable 32 bit memory at 0xd500 [0xd5003fff]

Regards,
Andreas Hartmann




2.4.3ac11: [drm:r128_do_wait_for_idle] *ERROR* r128_do_wait_for_idle failed!

2001-04-21 Thread Andreas Hartmann

Hello world!

I don't know whether this is the right mailing list, but I found another 
posting about this problem here, so I decided to post this additional 
question here too.

I'm using kernel 2.4.3ac11 (or earlier) and the current DRI sources from 
dri.sourceforge.net with an ATI EXPERT2000 graphics accelerator (Rage 128). 
My motherboard has the VIA686a (VIA KX133) chipset with an Athlon 800MHz 
processor.

Acceleration seems to work fine so far, but I often get the error message 
quoted in the subject.
Another message I often get is the following:
[drm:r128_do_wait_for_fifo] *ERROR* r128_do_wait_for_fifo failed!

I found the posting from Anton Blanchard (subject: Re: uniteruptable sleep) 
on this mailing list and checked his suggestion. The changes he proposed 
have already been implemented in the DRI CVS source code, but the problem 
still exists.

Is the problem sitting 50 cm in front of the screen, or is it a bug?


Thank you very much for your great work and your advice
Regards,
Andreas Hartmann



Problem with DMA or agpgart on VIA686a-boards consists with kernel 2.4.2?

2001-02-25 Thread Andreas Hartmann

Hardware used:
AMD Athlon 800
VIA KX133 chipset with AGP

Chipset: ATI 264LT Pro (3D Rage LT Pro) (Port Probed)
Memory:  8192 Kbytes
RAMDAC:  ATI Mach64 integrated 15/16/24/32-bit DAC w/clock
 (with 6-bit wide lookup tables (or in 6-bit mode))
 (programmable for 6/8-bit wide lookup tables)
Attached graphics coprocessor:
Chipset: ATI Mach64
Memory:  8192 Kbytes


Software versions used:
- Kernel 2.3.x until 2.4.2
- Kernel 2.4test
- support for VIA82Cxxx enabled, with DMA used by default
- X 3.3.6
- agpgart module
- glx from sourceforge.net with DMA and agpgart
- reiserfs
- DMA mode of the hard disk (WDC WD205AA) is turned on with hdparm and runs 
   in UDMA4 mode.




Hello all,

I already found the following problem in some 2.3 versions and in the 
2.4test versions:
The X screen suddenly starts blinking and can't be stabilized without 
rebooting the machine (which did not freeze!). Restarting the X server 
doesn't help; the problem persists.
Unfortunately I couldn't find any error log. One hint may be the output in 
glx.log when glx is started in the "damaged" state (note the negative 
benchmark values):

   119:dma buffer transfer speed:
 13698:DmaBenchmark 0xfde00 bytes, 0.010 sec: 97 mb/s
 15092:DmaBenchmark 0xfde00 bytes, -0.019 sec: -51 mb/s
 13647:DmaBenchmark 0xfde00 bytes, 0.010 sec: 103 mb/s

In the undamaged state, the output looks something like this:
   119:dma buffer transfer speed:
 20007:DmaBenchmark 0xfde00 bytes, 0.016 sec: 61 mb/s
 10397:DmaBenchmark 0xfde00 bytes, 0.006 sec: 163 mb/s
 10296:DmaBenchmark 0xfde00 bytes, 0.006 sec: 164 mb/s
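
The suspicious entries can be picked out of glx.log mechanically. This is a minimal sketch, assuming the log lines have exactly the format shown above; the function name and the "suspect" flag are hypothetical. A negative duration means the end timestamp of the measurement read earlier than the start timestamp, so such a line is flagged rather than trusted.

```python
import re

# One benchmark line, e.g.
#  15092:DmaBenchmark 0xfde00 bytes, -0.019 sec: -51 mb/s
LINE = re.compile(
    r"DmaBenchmark (0x[0-9a-fA-F]+) bytes, (-?\d+\.\d+) sec: (-?\d+) mb/s"
)


def parse_benchmarks(text):
    """Return one dict per DmaBenchmark line found in `text`."""
    results = []
    for match in LINE.finditer(text):
        seconds = float(match.group(2))
        results.append({
            "bytes": int(match.group(1), 16),   # buffer size, given in hex
            "seconds": seconds,                 # measured duration
            "mb_s": int(match.group(3)),        # rate as printed in the log
            "suspect": seconds <= 0,            # clock went backwards
        })
    return results


damaged_log = """\
   119:dma buffer transfer speed:
 13698:DmaBenchmark 0xfde00 bytes, 0.010 sec: 97 mb/s
 15092:DmaBenchmark 0xfde00 bytes, -0.019 sec: -51 mb/s
 13647:DmaBenchmark 0xfde00 bytes, 0.010 sec: 103 mb/s
"""

for entry in parse_benchmarks(damaged_log):
    if entry["suspect"]:
        print("suspect measurement:", entry)
```

Run against the "damaged" excerpt, only the middle measurement is flagged; the excerpt from the undamaged state produces no flagged entries.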

Some information about glx:
glx provides 3D functions under X 3.3.6 for my graphics chip and uses DMA 
and agpgart to do so. glx also uses a "little" file in the /tmp directory, 
which resides on a reiserfs partition.

There are no problems with kernel 2.2.x and the special IDE-patches from 
Andre Hedrick.

Now some interesting information perhaps:
The patches from Alan Cox 2.4.1ac9 or ac17 (I didn't test the others) are 
working fine. I couldn't find any problem with these patches.

I would very much appreciate it if the related patches from Alan Cox could 
find their way into the official kernel!

Kind regards,
Andreas Hartmann