On 04/06/2015 07:59 PM, Steven Rostedt wrote:
>
Thanks for the comments.
> Hmm, why is it not allowed?
>
> If we just let it boost it, it will cut down on the code changes and
> checks that add to the hot paths.
>
There is a WARN_ON() at line 3150 in sched/core.c to warn against
boosting
This patch fixes the problem that the ownership of a mutex acquired by an
interrupt handler (IH) gets incorrectly attributed to the interrupted thread.
This could result in an incorrect deadlock detection in function
rt_mutex_adjust_prio_chain(), causing a thread to be killed and possibly leading
up
task value.
- Removed code to handle the reserved interrupt handler's task value.
Thavatchai Makphaibulchoke (2):
rtmutex Real-Time Linux: Fixing kernel BUG at
kernel/locking/rtmutex.c:997!
kernel/locking/rtmutex.c: some code optimization
kernel/locking/rtmutex.c | 107
Adding the following code optimizations:
- Reducing the number of cmpxchgs. Only call mark_rt_mutex_waiters() when
needed, i.e. when the waiters bit is not set.
- Reducing the hold time of wait_lock.
- Calling fixup_rt_mutex_waiters() only when needed.
- When unlocking rt_spin_lock in IRQ, alternate
On 03/22/2015 10:42 PM, Mike Galbraith wrote:
>> Why can't we just Let swapper be the owner when in irq with no dummy?
>>
Thanks Mike for the suggestion. That may also work. Unfortunately,
I'm still seeing a hang, which may be related to the
priority of the interrupt handler
On 03/22/2015 10:42 PM, Mike Galbraith wrote:
Why can't we just Let swapper be the owner when in irq with no dummy?
Thanks Mike for the suggestion. That may also work. Unfortunately,
I'm still seeing a hang, which may be related to the
priority of the interrupt handler task.
On 03/19/2015 10:26 AM, Steven Rostedt wrote:
> On Thu, 19 Mar 2015 09:17:09 +0100
> Mike Galbraith wrote:
>
>
>> (aw crap, let's go shopping)... so why is the one in timer.c ok?
>
> It's not. Sebastian, you said there were no other cases of rt_mutexes
> being taken in hard irq context. Looks
On 03/19/2015 10:26 AM, Steven Rostedt wrote:
On Thu, 19 Mar 2015 09:17:09 +0100
Mike Galbraith umgwanakikb...@gmail.com wrote:
(aw crap, let's go shopping)... so why is the one in timer.c ok?
It's not. Sebastian, you said there were no other cases of rt_mutexes
being taken in hard irq
On 02/23/2015 11:37 AM, Steven Rostedt wrote:
>
> OK, I believe I understand the issue. Perhaps it would be much better
> to create a fake task per CPU that we use when grabbing locks in
> interrupt mode. And make these have a priority of 0 (highest), since
> they can not be preempted, they do
On 02/23/2015 11:37 AM, Steven Rostedt wrote:
OK, I believe I understand the issue. Perhaps it would be much better
to create a fake task per CPU that we use when grabbing locks in
interrupt mode. And make these have a priority of 0 (highest), since
they can not be preempted, they do have
On 02/19/2015 09:53 PM, Steven Rostedt wrote:
> On Thu, 19 Feb 2015 18:31:05 -0700
> Thavatchai Makphaibulchoke wrote:
>
>> This patch fixes the problem that the ownership of a mutex acquired by an
>> interrupt handler(IH) gets incorrectly attributed to the interrupte
On 02/19/2015 09:53 PM, Steven Rostedt wrote:
On Thu, 19 Feb 2015 18:31:05 -0700
Thavatchai Makphaibulchoke t...@hp.com wrote:
This patch fixes the problem that the ownership of a mutex acquired by an
interrupt handler(IH) gets incorrectly attributed to the interrupted thread.
*blink
Adding the following code optimizations:
- Reducing the number of cmpxchgs. Only call mark_rt_mutex_waiters() when
needed, i.e. when the waiters bit is not set.
- Reducing the hold time of wait_lock.
- Calling fixup_rt_mutex_waiters() only when needed.
Signed-off-by: T. Makphaibulchoke
---
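The first optimization in the list above (skip the cmpxchg when the waiters bit is already set) can be illustrated with a small userspace toy model. This is only a sketch under stated assumptions, not the kernel code: `struct toy_lock`, `HAS_WAITERS`, `mark_waiters()`, `mark_waiters_if_needed()` and the `cmpxchg_calls` counter are hypothetical names, and C11 atomics stand in for the kernel's cmpxchg.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag kept in the low bit of the owner word. */
#define HAS_WAITERS ((uintptr_t)1)

/* Toy lock: owner pointer and waiters bit share one atomic word. */
struct toy_lock {
	_Atomic uintptr_t owner;
};

/* Instrumentation only (single-threaded demo): counts cmpxchg attempts. */
static unsigned int cmpxchg_calls;

/* Unconditionally OR the waiters bit into the owner word via a cmpxchg loop. */
static void mark_waiters(struct toy_lock *l)
{
	uintptr_t old;

	assert(l != NULL);
	old = atomic_load(&l->owner);
	do {
		cmpxchg_calls++;
	} while (!atomic_compare_exchange_weak(&l->owner, &old,
					       old | HAS_WAITERS));
}

/* Optimized variant: a plain load first; cmpxchg only if the bit is clear. */
static void mark_waiters_if_needed(struct toy_lock *l)
{
	if (atomic_load(&l->owner) & HAS_WAITERS)
		return;		/* bit already set, nothing to do */
	mark_waiters(l);
}
```

Calling mark_waiters_if_needed() twice on the same lock performs the cmpxchg loop only on the first call; the second call returns after a single load.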
This patch series consists of 2 patches.
First patch, fixing kernel BUG at kernel/locking/rtmutex.c:997!
Second patch, some code optimization in kernel/locking/rtmutex.c
Thavatchai Makphaibulchoke (2):
rtmutex Real-Time Linux: Fixing kernel BUG at
kernel/locking/rtmutex.c:997!
kernel
On 02/13/2015 12:19 PM, Paul Gortmaker wrote:
>
> I think there is more to this issue than just a lock conversion.
> Firstly, if we look at the existing -rt patches, we've got the old
> patch from ~2009 that is:
>
Thanks Paul for testing and reporting the problem.
Yes, looks like the issue
On 02/13/2015 12:19 PM, Paul Gortmaker wrote:
I think there is more to this issue than just a lock conversion.
Firstly, if we look at the existing -rt patches, we've got the old
patch from ~2009 that is:
Thanks Paul for testing and reporting the problem.
Yes, looks like the issue
Since memory cgroups can be called from a page fault handler as shown
by the stack dump here,
[12679.513255] BUG: scheduling while atomic: ssh/10621/0x0002
[12679.513305] Preemption disabled at:[]
mem_cgroup_charge_common+0x37/0x60
[12679.513305]
[12679.513322] Call Trace:
[12679.513331] []
Since memory cgroups can be called from a page fault handler as shown
by the stack dump here,
[12679.513255] BUG: scheduling while atomic: ssh/10621/0x0002
[12679.513305] Preemption disabled at:[811a20f7]
mem_cgroup_charge_common+0x37/0x60
[12679.513305]
[12679.513322] Call Trace:
On 06/24/2014 04:03 PM, Maciej W. Rozycki wrote:
> On Thu, 3 Apr 2014, Theodore Ts'o wrote:
>
>> fs/mbcache.c: doucple the locking of local from global data
>
> This change causes breakage:
>
> fs/built-in.o: In function `__mb_cache_entry_release':
> mbcache.c:(.text+0x56b3c): undefined
On 06/24/2014 04:03 PM, Maciej W. Rozycki wrote:
On Thu, 3 Apr 2014, Theodore Ts'o wrote:
fs/mbcache.c: doucple the locking of local from global data
This change causes breakage:
fs/built-in.o: In function `__mb_cache_entry_release':
mbcache.c:(.text+0x56b3c): undefined reference
On 05/26/2014 11:04 PM, Randy Dunlap wrote:
> On 05/26/2014 11:17 AM, werner wrote:
> @tmac: can mbcache.c #include <linux/log2.h> and use ilog2(NR_BG_LOCKS)
> instead of using __builtin_log2(NR_BG_LOCKS) ?
> (ref. commit ID 1f3e55fe02d12213f87869768aa2b0bad3ba9a7d)
>
I don't see any problem with that,
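
For context on the suggestion above: __builtin_log2() is GCC's builtin for the floating-point log2(), so when it is not folded to a constant the compiler may emit a call into libm, which the kernel does not link against. An integer log2 avoids that. The following is a minimal userspace sketch, not the kernel's ilog2() implementation; `ilog2_u32()` is a hypothetical stand-in and the `NR_BG_LOCKS` value is an example only.

```c
#include <assert.h>
#include <stdint.h>

/* Example value only; the real NR_BG_LOCKS depends on the kernel config. */
#define NR_BG_LOCKS 128u

/*
 * Hypothetical stand-in for an integer log2: index of the highest set bit.
 * Uses the GCC/Clang builtin __builtin_clz(), which is undefined for 0,
 * so zero is rejected up front.
 */
static inline unsigned int ilog2_u32(uint32_t v)
{
	assert(v != 0);
	return 31u - (unsigned int)__builtin_clz(v);
}
```

With these definitions, ilog2_u32(NR_BG_LOCKS) is 7, computed with integer operations only and no dependency on a math library.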
On 05/26/2014 11:04 PM, Randy Dunlap wrote:
On 05/26/2014 11:17 AM, werner wrote:
@tmac: can mbcache.c #include <linux/log2.h> and use ilog2(NR_BG_LOCKS)
instead of using __builtin_log2(NR_BG_LOCKS) ?
(ref. commit ID 1f3e55fe02d12213f87869768aa2b0bad3ba9a7d)
I don't see any problem with
On 04/15/2014 11:25 AM, Jan Kara wrote:
> I have checked the source and I didn't find many places where i_mutex was
> not held. But maybe I'm wrong. That's why I wanted to see the patch where
> you are using i_mutex instead of hashed mutexes and which didn't perform
> good enough.
>
I've
On 04/14/2014 11:40 AM, Jan Kara wrote:
> Thanks for trying that out! Can you please send me a patch you have been
> testing? Because it doesn't quite make sense to me why using i_mutex should
> be worse than using hashed locks...
>
Thanks again for the comments.
Since i_mutex is also used for
On 04/14/2014 11:40 AM, Jan Kara wrote:
Thanks for trying that out! Can you please send me a patch you have been
testing? Because it doesn't quite make sense to me why using i_mutex should
be worse than using hashed locks...
Thanks again for the comments.
Since i_mutex is also used for
On 04/15/2014 11:25 AM, Jan Kara wrote:
I have checked the source and I didn't find many places where i_mutex was
not held. But maybe I'm wrong. That's why I wanted to see the patch where
you are using i_mutex instead of hashed mutexes and which didn't perform
good enough.
I've attached
On 04/02/2014 11:41 AM, Jan Kara wrote:
> Thanks for the patches and measurements! So I agree we contend a lot on
> orphan list changes in ext4. But what you do seems to be unnecessarily
> complicated and somewhat hiding the real substance of the patch. If I
> understand your patch correctly,
On 04/02/2014 11:41 AM, Jan Kara wrote:
Thanks for the patches and measurements! So I agree we contend a lot on
orphan list changes in ext4. But what you do seems to be unnecessarily
complicated and somewhat hiding the real substance of the patch. If I
understand your patch correctly, all it
On 01/24/2014 11:09 PM, Andreas Dilger wrote:
> I think the ext4 block groups are locked with the blockgroup_lock that has
> about the same number of locks as the number of cores, with a max of 128,
> IIRC. See blockgroup_lock.h.
>
> While there is some chance of contention, it is also
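
The hashed-lock scheme described above (a fixed, power-of-two sized array of locks indexed by block group number) can be sketched in userspace as follows. This is an illustration under stated assumptions rather than the contents of blockgroup_lock.h: `NR_LOCKS`, `struct lock_table`, `lock_index()` and `lock_for_group()` are hypothetical names, and a C11 atomic_flag stands in for a kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Example table size; must be a power of two for the mask below to work. */
#define NR_LOCKS 128u
_Static_assert((NR_LOCKS & (NR_LOCKS - 1u)) == 0, "NR_LOCKS must be 2^n");

/* Toy lock table: one flag per slot, shared by all groups hashing to it. */
struct lock_table {
	atomic_flag locks[NR_LOCKS];
};

/* Put every slot into the unlocked state. */
static void lock_table_init(struct lock_table *t)
{
	assert(t != NULL);
	for (size_t i = 0; i < NR_LOCKS; i++)
		atomic_flag_clear(&t->locks[i]);
}

/* Map a block group number to a slot by masking off the high bits. */
static unsigned int lock_index(unsigned int group)
{
	return group & (NR_LOCKS - 1u);
}

/* Groups that differ by a multiple of NR_LOCKS share the same lock. */
static atomic_flag *lock_for_group(struct lock_table *t, unsigned int group)
{
	return &t->locks[lock_index(group)];
}
```

Two groups contend only when their numbers collide modulo the table size, which is the "some chance of contention" trade-off mentioned in the message.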
On 01/28/2014 02:09 PM, Andreas Dilger wrote:
> On Jan 28, 2014, at 5:26 AM, George Spelvin wrote:
>>> The third part of the patch further increases the scalability of an ext4
>>> filesystem by having each ext4 filesystem allocate and use its own private
>>> mbcache structure, instead of sharing a
On 01/28/2014 02:09 PM, Andreas Dilger wrote:
On Jan 28, 2014, at 5:26 AM, George Spelvin li...@horizon.com wrote:
The third part of the patch further increases the scalability of an ext4
filesystem by having each ext4 filesystem allocate and use its own private
mbcache structure, instead of
On 01/24/2014 11:09 PM, Andreas Dilger wrote:
I think the ext4 block groups are locked with the blockgroup_lock that has
about the same number of locks as the number of cores, with a max of 128,
IIRC. See blockgroup_lock.h.
While there is some chance of contention, it is also unlikely
On 01/24/2014 02:38 PM, Andi Kleen wrote:
> T Makphaibulchoke writes:
>
>> The patch consists of three parts.
>>
>> The first part changes the implementation of both the block and hash chains
>> of
>> an mb_cache from list_head to hlist_bl_head and also introduces new members,
>> including a
On 01/24/2014 02:38 PM, Andi Kleen wrote:
T Makphaibulchoke t...@hp.com writes:
The patch consists of three parts.
The first part changes the implementation of both the block and hash chains
of
an mb_cache from list_head to hlist_bl_head and also introduces new members,
including a
On 10/30/2013 08:42 AM, Theodore Ts'o wrote:
> I tried running xfstests with this patch, and it blew up on
> generic/020 test:
>
> generic/020 [10:21:50][ 105.170352] [ cut here ]
> [ 105.171683] kernel BUG at
>
On 10/30/2013 08:42 AM, Theodore Ts'o wrote:
I tried running xfstests with this patch, and it blew up on
generic/020 test:
generic/020 [10:21:50][ 105.170352] [ cut here ]
[ 105.171683] kernel BUG at
/usr/projects/linux/ext4/include/linux/bit_spinlock.h:76!
[
On 10/03/2013 06:28 PM, Andreas Dilger wrote:
>
> It would also be possible to have a completely contention-free orphan
> inode list by only generating the on-disk orphan linked list in a
> pre-commit callback hook from an efficient in-memory list. That would
> allow the common "add to orphan
On 10/03/2013 06:37 PM, Andreas Dilger wrote:
> On 2013-10-02, at 9:36 AM, T Makphaibulchoke wrote:
>
> What do these additional fields do to the size of struct ext4_inode_info?
> I recall that Ted did a bunch of work to shrink this enough to fit nicely
> into a slab, and it would be a shame to
On 10/03/2013 06:41 PM, Andreas Dilger wrote:
>> +struct inode *next_inode;
>
> Stack space in the kernel is not so abundant that all (or any?) of these
> should get their own local variable.
>
>>
>> -if (!EXT4_SB(sb)->s_journal)
>
> Same here.
>
>
> Cheers, Andreas
Thanks Andreas
On 10/02/2013 08:05 PM, Zheng Liu wrote:
> Hello,
>
>> -if (!EXT4_SB(sb)->s_journal)
>> +if (ext4_sb->s_journal)
>
> typo: !ext4_sb->s_journal
> I am not sure whether or not this will impact the result because when
> journal is enabled the inode will not be added
On 10/02/2013 08:05 PM, Zheng Liu wrote:
Hello,
-if (!EXT4_SB(sb)->s_journal)
+if (ext4_sb->s_journal)
typo: !ext4_sb->s_journal
I am not sure whether or not this will impact the result because when
journal is enabled the inode will not be added into orphan
On 10/03/2013 06:41 PM, Andreas Dilger wrote:
+struct inode *next_inode;
Stack space in the kernel is not so abundant that all (or any?) of these
should get their own local variable.
-if (!EXT4_SB(sb)->s_journal)
Same here.
Cheers, Andreas
Thanks Andreas for the comments.
On 10/03/2013 06:37 PM, Andreas Dilger wrote:
On 2013-10-02, at 9:36 AM, T Makphaibulchoke wrote:
What do these additional fields do to the size of struct ext4_inode_info?
I recall that Ted did a bunch of work to shrink this enough to fit nicely
into a slab, and it would be a shame to
On 10/03/2013 06:28 PM, Andreas Dilger wrote:
It would also be possible to have a completely contention-free orphan
inode list by only generating the on-disk orphan linked list in a
pre-commit callback hook from an efficient in-memory list. That would
allow the common add to orphan list; do
On 09/11/2013 09:25 PM, Theodore Ts'o wrote:
> On Wed, Sep 11, 2013 at 03:48:57PM -0500, Eric Sandeen wrote:
>>
>> So at this point I think it's up to Mak to figure out why on his system,
>> aim7 is triggering mbcache codepaths.
>>
>
> Yes, the next thing is to see if on his systems, whether or
On 09/11/2013 09:25 PM, Theodore Ts'o wrote:
On Wed, Sep 11, 2013 at 03:48:57PM -0500, Eric Sandeen wrote:
So at this point I think it's up to Mak to figure out why on his system,
aim7 is triggering mbcache codepaths.
Yes, the next thing is to see if on his systems, whether or not he's
On 09/10/2013 09:02 PM, Theodore Ts'o wrote:
> On Tue, Sep 10, 2013 at 02:47:33PM -0600, Andreas Dilger wrote:
>> I agree that SELinux is enabled on enterprise distributions by default,
>> but I'm also interested to know how much overhead this imposes. I would
>> expect that writing large
On 09/10/2013 09:02 PM, Theodore Ts'o wrote:
On Tue, Sep 10, 2013 at 02:47:33PM -0600, Andreas Dilger wrote:
I agree that SELinux is enabled on enterprise distributions by default,
but I'm also interested to know how much overhead this imposes. I would
expect that writing large external
On 09/06/2013 05:10 AM, Andreas Dilger wrote:
> On 2013-09-05, at 3:49 AM, Thavatchai Makphaibulchoke wrote:
>> No, I did not do anything special, including changing an inode's size. I
>> just used the profile data, which indicated mb_cache module as one of the
>> bottlene
On 09/06/2013 05:10 AM, Andreas Dilger wrote:
On 2013-09-05, at 3:49 AM, Thavatchai Makphaibulchoke wrote:
No, I did not do anything special, including changing an inode's size. I
just used the profile data, which indicated mb_cache module as one of the
bottleneck. Please see below for perf
On 09/04/2013 08:00 PM, Andreas Dilger wrote:
>
> In the past, I've raised the question of whether mbcache is even
> useful on real-world systems. Essentially, this is providing a
> "deduplication" service for ext2/3/4 xattr blocks that are identical.
> The question is how often this is actually
On 09/05/2013 02:35 AM, Theodore Ts'o wrote:
> How did you gather these results? The mbcache is only used if you are
> using extended attributes, and only if the extended attributes don't
> fit in the inode's extra space.
>
> I checked aim7, and it doesn't do any extended attribute operations.
>
On 09/05/2013 02:35 AM, Theodore Ts'o wrote:
How did you gather these results? The mbcache is only used if you are
using extended attributes, and only if the extended attributes don't
fit in the inode's extra space.
I checked aim7, and it doesn't do any extended attribute operations.
So
On 09/04/2013 08:00 PM, Andreas Dilger wrote:
In the past, I've raised the question of whether mbcache is even
useful on real-world systems. Essentially, this is providing a
deduplication service for ext2/3/4 xattr blocks that are identical.
The question is how often this is actually the
Thanks for the comments.
On 08/22/2013 04:53 PM, Linus Torvalds wrote:
>
> Please don't do these ugly and pointless preprocessor macro expanders
> that hide what the actual operation is.
>
> The DEBUG case seems to be just for your own testing anyway, so even
> that shouldn't exist in the
Thanks for the comments.
On 08/22/2013 04:53 PM, Linus Torvalds wrote:
Please don't do these ugly and pointless preprocessor macro expanders
that hide what the actual operation is.
The DEBUG case seems to be just for your own testing anyway, so even
that shouldn't exist in the merged
On 07/18/2013 04:33 PM, Theodore Ts'o wrote:
> On Thu, Jul 18, 2013 at 12:30:24PM -0400, Theodore Ts'o wrote:
>> On Wed, Jul 17, 2013 at 06:55:10PM -0600, T Makphaibulchoke wrote:
>>> This patch intends to improve the scalability of an ext4 filesystem by
>>> introducing higher degree of
On 07/18/2013 04:33 PM, Theodore Ts'o wrote:
On Thu, Jul 18, 2013 at 12:30:24PM -0400, Theodore Ts'o wrote:
On Wed, Jul 17, 2013 at 06:55:10PM -0600, T Makphaibulchoke wrote:
This patch intends to improve the scalability of an ext4 filesystem by
introducing higher degree of parallelism to the
Thank you both for the comments.
Sounds like a better solution is to allow accesses to only I/O regions
presented in the EFI memory map for physical addresses below 1 MB.
Do we need to worry about the X checksum in the first MB on an EFI system?
Thanks,
Mak.
On 10/02/2012 11:15 PM, Matthew
On 08/27/2012 10:30 PM, Ram Pai wrote:
> For example:
> if the region requested is 1 to 100, but 20-30 is already reserved, then
> the earlier behavior would reserve 1-20 and 30-100. With your
> patch, it will just reserve 1-20.
>
> RP
>
Thanks RP for pointing out the problem. Sorry for missing
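
The earlier behavior in RP's example (reserve everything in the request that does not collide with an existing reservation) can be sketched as below. This is a simplified illustration, not the kernel resource code: `struct range` and `reserve_around()` are hypothetical, only a single busy range is handled, and ranges are assumed to be half-open [start, end) so that the numbers match the example.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open range [start, end); an assumption made for this sketch. */
struct range {
	unsigned long start, end;
};

/*
 * Store in out[] the parts of req not covered by busy (at most two pieces,
 * one on each side of busy) and return how many pieces were stored.
 */
static size_t reserve_around(struct range req, struct range busy,
			     struct range out[2])
{
	size_t n = 0;

	assert(busy.start <= busy.end);

	/* No overlap: the whole request is free. */
	if (busy.end <= req.start || busy.start >= req.end) {
		if (req.start < req.end)
			out[n++] = req;
		return n;
	}
	/* Piece before the busy range, if any. */
	if (req.start < busy.start)
		out[n++] = (struct range){ req.start, busy.start };
	/* Piece after the busy range, if any. */
	if (busy.end < req.end)
		out[n++] = (struct range){ busy.end, req.end };
	return n;
}
```

For a request of 1-100 with 20-30 busy, this yields the two pieces 1-20 and 30-100; returning after the first piece would reproduce the regression described above.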
On 08/27/2012 10:30 PM, Ram Pai wrote:
For example:
if the region requested is 1 to 100, but 20-30 is already reserved, then
the earlier behavior would reserve 1-20 and 30-100. With your
patch, it will just reserve 1-20.
RP
Thanks RP for pointing out the problem. Sorry for missing part of