      (p)            (s)
      / \            / \
     S   N   -->   Sl   P
    / \                / \
   sl (sr)          (sr)  N

This is actually right rotation at "(p)" + color flips, not left
rotation + color flips.
Signed-off-by: Jie Chen <fykc...@gmail.com>
---
lib/rbtree.c | 23 +++
1 file changed, 19 insertions(+), 4 deletions(-)
diff
Ingo Molnar wrote:
* Jie Chen <[EMAIL PROTECTED]> wrote:
The following is pthread_sync output for 2.6.21.7-cfs-v24 #1 SMP
kernel.
2 threads:
PARALLEL time = 11.106580 microseconds +/- 0.002460
PARALLEL overhead = 0.617590 microseconds +/- 0.003409
Output for Kernel 2.6.24-rc4
Ingo Molnar wrote:
* Jie Chen <[EMAIL PROTECTED]> wrote:
Hi, Ingo:
I guess it is good news. I did patch the 2.6.21.7 kernel using your cfs
patch. The results of pthread_sync are the same as for the non-patched
2.6.21 kernel. This means the performance issue is not related to the
scheduler
Ingo Molnar wrote:
* Jie Chen <[EMAIL PROTECTED]> wrote:
and then you use this in the measurement loop:
for (k=0; k<=OUTERREPS; k++){
  start = getclock();
  for (j=0; j<innerreps; j++){
#ifdef _QMT_PUBLIC
    delay((void *)0, 0);
#else
    delay(0, 0, 0, (void *)0);
#endif
the problem is, this does not take the overhead of gettimeofday into
account - which overhead can easily
Ingo Molnar wrote:
* Jie Chen <[EMAIL PROTECTED]> wrote:
I did patch the header file and recompiled the kernel. I observed no
difference (two threads overhead stays too high). Thank you.
ok, i think i found it. You do this in your qmt/pthread_sync.c
test-code:
double get_time_of_day_
Ingo Molnar wrote:
* Jie Chen <[EMAIL PROTECTED]> wrote:
not "BARRIER time". I've re-read the discussion and found no hint
about how to build and run a barrier test. Either i missed it or it's
so obvious to you that you didnt mention it :-)
Ingo
Hi, Ingo:
Did you do configure --enable
Ingo Molnar wrote:
* Jie Chen <[EMAIL PROTECTED]> wrote:
sorry to be dense, but could you give me instructions how i could
remove the affinity mask and test the "barrier overhead" myself? I
have built "pthread_sync" and it outputs numbers for me - which one
would be the barrier overhead
Peter Zijlstra wrote:
On Wed, 2007-11-21 at 15:34 -0500, Jie Chen wrote:
It is clear that the synchronization overhead increases as the number
of threads increases in the kernel 2.6.21. But the synchronization
overhead actually decreases as the number of threads increases in the
kernel
Ingo Molnar wrote:
* Jie Chen <[EMAIL PROTECTED]> wrote:
I just disabled the affinity mask and reran the test. There were no
significant changes for two threads (barrier overhead is around 9
microseconds). As for 8 threads, the barrier overhead actually drops a
little, which is good.
Ingo Molnar wrote:
* Jie Chen <[EMAIL PROTECTED]> wrote:
Since I am using the affinity flag to bind each thread to a different
core, the synchronization overhead should increase as the number of
cores/threads increases. But what we observed in the new kernel is the
opposite. The barrier overhead
Ingo Molnar wrote:
* Jie Chen <[EMAIL PROTECTED]> wrote:
the moment you saturate the system a bit more, the numbers should
improve even with such a ping-pong test.
You are right. If I manually do load balancing (binding unrelated
processes on the other cores), my test code performs as well as it did
Ingo Molnar wrote:
* Jie Chen <[EMAIL PROTECTED]> wrote:
I just ran the same test on two 2.6.24-rc4 kernels: one with
CONFIG_FAIR_GROUP_SCHED on and the other with CONFIG_FAIR_GROUP_SCHED
off. The odd behavior I described in my previous e-mails was still
there for both kernels. Let me know
Ingo Molnar wrote:
* Jie Chen <[EMAIL PROTECTED]> wrote:
Simon Holm Thøgersen wrote:
Wed, 2007-11-21 at 20:52 -0500, Jie Chen wrote:
There is a backport of the CFS scheduler to 2.6.21, see
http://lkml.org/lkml/2007/11/19/127
Hi, Simon:
I will try that after the thanksgiving holiday to find out whether the
odd behavior will show up using 2.6.21
Simon Holm Thøgersen wrote:
Wed, 2007-11-21 at 20:52 -0500, Jie Chen wrote:
There is a backport of the CFS scheduler to 2.6.21, see
http://lkml.org/lkml/2007/11/19/127
Hi, Simon:
I will try that after the thanksgiving holiday to find out whether the
odd behavior will show up using 2.6.21
Eric Dumazet wrote:
Jie Chen a écrit :
Hi, there:
We have a simple pthread program that measures the synchronization
overheads for various synchronization mechanisms such as spin locks,
barriers (the barrier is implemented using a queue-based barrier
algorithm) and so on. We have dual
###
Jie Chen
Scientific Computing Group
Thomas Jefferson National Accelerator Facility
12000, Jefferson Ave.
Newport News, VA 23606
(757)269-5046 (office) (757)269-6248 (fax)
[EMAIL PROTECTED]
###
CONFIG_X86_64=y
CONFIG_64BIT=y
CONFIG_X86=y