Re: High sleep_on_page latency

2012-01-15 Thread Prateek Sharma
On Mon, Jan 16, 2012 at 10:41 AM, Mulyadi Santosa wrote:
> Hi Prateek...
>
> On Mon, Jan 16, 2012 at 11:48, Prateek Sharma  wrote:
>> The latencytop output for the process doing the IO (qemu) is below:
>>
>> Process qemu-system-x86 (6239)             Total: 31363.7 msec
>> [sleep_on_page]                                   1966.5 msec         81.4 %
>> Waiting for event (select)                          5.0 msec          7.8 %
>> [kvm_vcpu_block]                                    5.0 msec          8.5 %
>> synchronous write                                   1.7 msec          0.0 %
>> Userspace lock contention                           1.5 msec          2.3 %
>
> Which qemu version do you use now? AFAIK newer qemu versions already
> use iothreads, which should reduce I/O latency. AFAIK iothreads is not
> enabled by default in older releases, but from qemu 1.0 onward it is
> enabled by default. Are you compiling from source?
>

I am using qemu-kvm 0.15, with KVM.

> Regarding the function sleep_on_page(): it in turn calls io_schedule().
> And here's the comment above the definition of io_schedule():
> /*
> * This task is about to go to sleep on IO. Increment rq->nr_iowait so
> * that process accounting knows that this is a task in IO wait state.
> */
>
> You can confirm it by yourself in:
> http://lxr.linux.no/#linux+v3.2.1/kernel/sched.c#L5872
>

   My initial understanding was that sleep_on_page is 'called' by
__lock_page, which is usually called on the file read path.
So I assumed that sleep_on_page accounts for page-lock contention.
My primary confusion is with the interpretation of the latencytop
output: two seconds, even for I/O to complete, seems awfully long.
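
For reference, this is roughly the code path I have in mind, from
mm/filemap.c of the 3.0-era kernels (quoted from memory, so please
check the actual tree):

static int sleep_on_page(void *word)
{
        io_schedule();
        return 0;
}

void __lock_page(struct page *page)
{
        DEFINE_WAIT_BIT(wait, &page->flags, PG_locked);

        __wait_on_bit_lock(page_waitqueue(page), &wait, sleep_on_page,
                                                TASK_UNINTERRUPTIBLE);
}

So time attributed to sleep_on_page can mean waiting on a locked page,
which in turn is often waiting for the I/O that will unlock it.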


> Hope it helps
>
> --
> regards,
>
> Mulyadi Santosa
> Freelance Linux trainer and consultant
>
> blog: the-hydra.blogspot.com
> training: mulyaditraining.blogspot.com



Re: two netcard configed in one subnet problem

2012-01-15 Thread Mulyadi Santosa
Hi

On Mon, Jan 16, 2012 at 09:16, hu jun  wrote:
> then I unplug eth1, ethtool eth1 shows "Link detected: no",
>
> now on B: ping 192.168.160.11 and 192.168.160.21, both are OK, too.

Maybe what happens is that when you ping eth1 of A from B, the flow
is: B sends the ICMP echo --> A's eth0 receives it --> it is forwarded
to eth1.

eth1, in my humble opinion, is not really answering "in hardware";
it is the network stack that replies. Since you put eth0 and eth1 in
the same broadcast domain, the effect is as if you were bonding
eth0+eth1 on A.

Just curious: do you have IP forwarding enabled on A? Or bridging, or
something similar?
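
Also, in case what you are seeing is the usual "ARP flux" behaviour
(by default Linux answers ARP for any local address on any interface),
the sysctls below are what people commonly tune. This is just a
sketch, please verify it fits your setup:

# /etc/sysctl.conf
# reply to ARP only if the target IP is configured on the interface
# that received the request
net.ipv4.conf.all.arp_ignore = 1
# answer ARP on an interface only if the kernel would also route the
# reply out of that interface
net.ipv4.conf.all.arp_filter = 1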


-- 
regards,

Mulyadi Santosa
Freelance Linux trainer and consultant

blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com



Re: High sleep_on_page latency

2012-01-15 Thread Mulyadi Santosa
Hi Prateek...

On Mon, Jan 16, 2012 at 11:48, Prateek Sharma  wrote:
> The latencytop output for the process doing the IO (qemu) is below:
>
> Process qemu-system-x86 (6239)             Total: 31363.7 msec
> [sleep_on_page]                                   1966.5 msec         81.4 %
> Waiting for event (select)                          5.0 msec          7.8 %
> [kvm_vcpu_block]                                    5.0 msec          8.5 %
> synchronous write                                   1.7 msec          0.0 %
> Userspace lock contention                           1.5 msec          2.3 %

Which qemu version do you use now? AFAIK newer qemu versions already
use iothreads, which should reduce I/O latency. AFAIK iothreads is not
enabled by default in older releases, but from qemu 1.0 onward it is
enabled by default. Are you compiling from source?

Regarding the function sleep_on_page(): it in turn calls io_schedule().
And here's the comment above the definition of io_schedule():
/*
* This task is about to go to sleep on IO. Increment rq->nr_iowait so
* that process accounting knows that this is a task in IO wait state.
*/

You can confirm it by yourself in:
http://lxr.linux.no/#linux+v3.2.1/kernel/sched.c#L5872
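
For reference, the function body itself is short; roughly (quoted from
memory, see the link above for the exact code):

void __sched io_schedule(void)
{
        struct rq *rq = raw_rq();

        delayacct_blkio_start();
        atomic_inc(&rq->nr_iowait);
        blk_flush_plug(current);
        current->in_iowait = 1;
        schedule();
        current->in_iowait = 0;
        atomic_dec(&rq->nr_iowait);
        delayacct_blkio_end();
}

So whatever shows up under sleep_on_page is basically time spent
blocked in schedule() while the task is flagged as being in I/O wait.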

Hope it helps

-- 
regards,

Mulyadi Santosa
Freelance Linux trainer and consultant

blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com



Re: Current Thread mapping

2012-01-15 Thread Mulyadi Santosa
Hi Santosh :)

On Sun, Jan 15, 2012 at 16:32, SaNtosh kuLkarni wrote:
> Hi everyone, I just wanted to know what the current implementation of
> user/kernel space thread mapping is: is it 1:1, or does it depend on
> the needs? For example, say I have 12k user-space threads running; how
> many kernel-space threads would be managing them? As far as I know
> there is a 1:1 mapping.

For NPTL-based threading (Native POSIX Thread Library), glibc and the
Linux kernel together maintain a 1:1 mapping, that is, each user-mode
thread is handled by one kernel-mode thread.

Before NPTL, IIRC it was M:1, that is, more than one user-mode thread
was mapped onto a single kernel-mode thread. In user space there was
some kind of "coordinator" or "master" that did cooperative switching
between threads. In NPTL, all threads are managed directly by the
Linux process scheduler.
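
If you want to see the 1:1 mapping yourself, here is a small sketch of
my own (not from any official doc): every pthread ends up with its own
kernel task ID, and /proc/<pid>/task/ shows one entry per thread.

#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

static void *worker(void *arg)
{
        /* no glibc wrapper for gettid() back then, so use syscall() */
        printf("pid=%d tid=%ld\n", (int) getpid(),
               (long) syscall(SYS_gettid));
        return NULL;
}

int main(void)
{
        pthread_t t[4];
        int i;

        for (i = 0; i < 4; i++)
                pthread_create(&t[i], NULL, worker, NULL);
        for (i = 0; i < 4; i++)
                pthread_join(t[i], NULL);
        return 0;
}

Compile with gcc -pthread; each thread prints the same PID but a
different TID, i.e. each one is a separate task for the scheduler.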

-- 
regards,

Mulyadi Santosa
Freelance Linux trainer and consultant

blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com



High sleep_on_page latency

2012-01-15 Thread Prateek Sharma
Hello,
  I am measuring system latency using latencytop. I am using kernel
version 3.0 with some minor changes.
The latencytop output for the process doing the IO (qemu) is below:

Process qemu-system-x86 (6239)             Total: 31363.7 msec
[sleep_on_page]                                   1966.5 msec         81.4 %
Waiting for event (select)                          5.0 msec          7.8 %
[kvm_vcpu_block]                                    5.0 msec          8.5 %
synchronous write                                   1.7 msec          0.0 %
Userspace lock contention                           1.5 msec          2.3 %

Are the ~2000 msec sleep_on_page delays normal? I would also like to
know what sleep_on_page exactly corresponds to. My understanding is
that it is the time spent waiting to grab a lock on a page. Is this
correct?

Thanks.



Re: Re: which local FS supports concurrent direct IO write?

2012-01-15 Thread Zheng Da
On Sun, Jan 15, 2012 at 4:22 PM, Raghavendra D Prabhu <
raghu.prabh...@gmail.com> wrote:

> Hi Zheng,
>
> Interesting analysis.
> * On Sun, Jan 15, 2012 at 03:17:12PM -0500, Zheng Da <
> zhengda1...@gmail.com> wrote:
>
>> Thanks. I was reading the code of kernel 3.0. XFS starts to support
>> concurrent direct IO since kernel 3.1.5.
>> But concurrent direct IO write still doesn't work well in kernel 3.2.
>>
> From what I have heard it has supported it for some time now. I think
> you may need to ask on the xfs general ML about this.

I didn't know about this ML. I'll ask there for help.

>
>  I wrote a test program that accesses a 4G file randomly (read and write),
>> and
>> I ran it with 8 threads and the machine has 8 cores. It turns out that
>> only
>> 1 core is running. I'm pretty sure xfs_rw_ilock is locked
>> with XFS_IOLOCK_SHARED in xfs_file_dio_aio_write.
>>
>> lockstat shows me that there is a lot of wait time in ip->i_lock. It seems
>> the lock is locked exclusively.
>>  &(&ip->i_lock)->mr_lock-W: 31568  36170
>> 0.24   20048.25 7589157.99 1301543146848
>> 0.00 217.70 1238310.72
>>  &(&ip->i_lock)->mr_lock-R: 11251  11886
>> 0.24   20043.01 2895595.18  46671 526309
>> 0.00  63.80  264097.96
>>  -
>>&(&ip->i_lock)->mr_lock  36170
>> [] xfs_ilock+0xb2/0x110 [xfs]
>>&(&ip->i_lock)->mr_lock  11886
>> [] xfs_ilock+0xea/0x110 [xfs]
>>  -
>>&(&ip->i_lock)->mr_lock  38555
>> [] xfs_ilock+0xb2/0x110 [xfs]
>>&(&ip->i_lock)->mr_lock   9501
>> [] xfs_ilock+0xea/0x110 [xfs]
>>
>> Then I used systemtap to instrument xfs_ilock and find there are at least
>> 3
>> functions that lock ip->i_lock exclusively during write.
>>
>
> From what I saw in the xfs_file_dio_aio_write code, it uses EXCL only
> if there is unaligned IO or there are cached pages to be invalidated
> after the shared lock is obtained, *but* it demotes that lock to
> SHARED just before generic_file_direct_write.

Actually, there are two locks in an XFS inode, i_lock and i_iolock.
systemtap shows me that i_iolock is indeed taken SHARED, but i_lock is
locked exclusively somewhere else. Even so, I don't think I have found
the exact spot that hurts concurrency so badly.
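
For reference, the two locks I mean live in struct xfs_inode, and the
flags passed to xfs_ilock() look roughly like this in the 3.x trees
(quoted from memory from fs/xfs/xfs_inode.h, so please check your
tree):

#define XFS_IOLOCK_EXCL         (1<<0)
#define XFS_IOLOCK_SHARED       (1<<1)
#define XFS_ILOCK_EXCL          (1<<2)
#define XFS_ILOCK_SHARED        (1<<3)

i_iolock serialises the file I/O path, while i_lock protects the
in-core inode metadata (extent maps and so on), which is why it can be
taken exclusively even while the I/O lock is only shared.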

Thanks,
Da


Re: Re: which local FS supports concurrent direct IO write?

2012-01-15 Thread Raghavendra D Prabhu

Hi Zheng,

Interesting analysis. 


* On Sun, Jan 15, 2012 at 03:17:12PM -0500, Zheng Da  
wrote:

Thanks. I was reading the code of kernel 3.0. XFS starts to support
concurrent direct IO since kernel 3.1.5.
But concurrent direct IO write still doesn't work well in kernel 3.2. 
From what I have heard it has supported it for some time now. I think
you may need to ask on the xfs general ML about this.




I wrote a test program that accesses a 4G file randomly (read and write), and
I ran it with 8 threads and the machine has 8 cores. It turns out that only
1 core is running. I'm pretty sure xfs_rw_ilock is locked
with XFS_IOLOCK_SHARED in xfs_file_dio_aio_write.

lockstat shows me that there is a lot of wait time in ip->i_lock. It seems
the lock is locked exclusively.
  &(&ip->i_lock)->mr_lock-W: 31568  36170
 0.24   20048.25 7589157.99 1301543146848
 0.00 217.70 1238310.72
  &(&ip->i_lock)->mr_lock-R: 11251  11886
 0.24   20043.01 2895595.18  46671 526309
 0.00  63.80  264097.96
  -
&(&ip->i_lock)->mr_lock  36170
[] xfs_ilock+0xb2/0x110 [xfs]
&(&ip->i_lock)->mr_lock  11886
[] xfs_ilock+0xea/0x110 [xfs]
  -
&(&ip->i_lock)->mr_lock  38555
[] xfs_ilock+0xb2/0x110 [xfs]
&(&ip->i_lock)->mr_lock   9501
[] xfs_ilock+0xea/0x110 [xfs]

Then I used systemtap to instrument xfs_ilock and find there are at least 3
functions that lock ip->i_lock exclusively during write.


From what I saw in the xfs_file_dio_aio_write code, it uses EXCL only
if there is unaligned IO or there are cached pages to be invalidated
after the shared lock is obtained, *but* it demotes that lock to
SHARED just before generic_file_direct_write.


Is there any popular FS that supports concurrent direct IO well?

Thanks,
Da

On Sat, Jan 14, 2012 at 6:45 AM, Raghavendra D Prabhu <
raghu.prabh...@gmail.com> wrote:


Hi Zheng,




* On Fri, Jan 13, 2012 at 04:41:16PM -0500, Zheng Da <
zhengda1...@gmail.com> wrote:



Hello,



I'm looking for a FS in Linux that supports concurrent direct IO write.
ext4 supports concurrent direct IO read if we mount it with
dioread_nolock,
but doesn't support concurrent writes. XFS doesn't support concurrent
direct IO at all. It locks the inode exclusive if it's direct IO. I tried
btrfs, and it seems it doesn't support concurrent direct IO either though
I
haven't looked into its code.
Is there a local FS that support concurrent direct IO write? It seems NFS
supports it (
http://kevinclosson.wordpress.com/2011/08/12/file-systems-for-a-database-choose-one-that-couples-direct-io-and-concurrent-io-whats-this-have-to-do-with-nfs-harken-back-5-2-years-to-find-out/
),
but I'm looking for local FS.



Thanks,
Da








XFS locks the inode exclusive only if it is an unaligned direct IO,
which is apparently done to prevent race conditions -- refer to
http://oss.sgi.com/archives/xfs/2011-01/msg00157.html
Also, the behavior of Ext4 under dioread_nolock is supported by XFS by
default and in a much better way. Also, Ext4 is the only one which uses
DIO_LOCKING while doing direct IO.





Regards,
--
Raghavendra Prabhu
GPG Id : 0xD72BE977
Fingerprint: B93F EBCB 8E05 7039 CD3C A4B8 A616 DCA1 D72B E977
www: wnohang.net




Re: which local FS supports concurrent direct IO write?

2012-01-15 Thread Zheng Da
Thanks. I was reading the code of kernel 3.0; XFS started to support
concurrent direct IO in kernel 3.1.5.
But concurrent direct IO write still doesn't work well in kernel 3.2. I
wrote a test program that accesses a 4G file randomly (read and write), and
I ran it with 8 threads and the machine has 8 cores. It turns out that only
1 core is running. I'm pretty sure xfs_rw_ilock is locked
with XFS_IOLOCK_SHARED in xfs_file_dio_aio_write.
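
For concreteness, the access pattern is essentially the following
sketch (not the exact program I ran; the file path is a placeholder
and most error handling is omitted):

#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define N_THREADS  8
#define BLOCK      4096
#define FILE_SIZE  (4ULL << 30)        /* 4 GB, preallocated beforehand */
#define N_OPS      100000

static int fd;

static void *worker(void *arg)
{
        void *buf;
        long i;

        /* O_DIRECT requires block-aligned buffer, offset and length */
        if (posix_memalign(&buf, BLOCK, BLOCK))
                return NULL;

        for (i = 0; i < N_OPS; i++) {
                off_t off = (off_t) (random() % (FILE_SIZE / BLOCK)) * BLOCK;

                if (random() & 1)
                        pwrite(fd, buf, BLOCK, off);
                else
                        pread(fd, buf, BLOCK, off);
        }
        free(buf);
        return NULL;
}

int main(void)
{
        pthread_t t[N_THREADS];
        int i;

        fd = open("/mnt/xfs/testfile", O_RDWR | O_DIRECT);
        if (fd < 0) {
                perror("open");
                return 1;
        }
        for (i = 0; i < N_THREADS; i++)
                pthread_create(&t[i], NULL, worker, NULL);
        for (i = 0; i < N_THREADS; i++)
                pthread_join(t[i], NULL);
        close(fd);
        return 0;
}

(built with gcc -pthread; on 32-bit you would also want
-D_FILE_OFFSET_BITS=64 so the offsets fit)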

lockstat shows me that there is a lot of wait time in ip->i_lock. It seems
the lock is locked exclusively.
   &(&ip->i_lock)->mr_lock-W: 31568  36170
  0.24   20048.25 7589157.99 1301543146848
  0.00 217.70 1238310.72
   &(&ip->i_lock)->mr_lock-R: 11251  11886
  0.24   20043.01 2895595.18  46671 526309
  0.00  63.80  264097.96
   -
 &(&ip->i_lock)->mr_lock  36170
 [] xfs_ilock+0xb2/0x110 [xfs]
 &(&ip->i_lock)->mr_lock  11886
 [] xfs_ilock+0xea/0x110 [xfs]
   -
 &(&ip->i_lock)->mr_lock  38555
 [] xfs_ilock+0xb2/0x110 [xfs]
 &(&ip->i_lock)->mr_lock   9501
 [] xfs_ilock+0xea/0x110 [xfs]

Then I used systemtap to instrument xfs_ilock and found that there are
at least 3 functions that lock ip->i_lock exclusively during writes.
Is there any popular FS that supports concurrent direct IO well?
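
In case it helps, the systemtap probe I mean is roughly the following
(a sketch, not a polished script; the parameter name and the
XFS_ILOCK_EXCL value are from the 3.x XFS sources as I remember them,
so please double-check against your tree):

probe module("xfs").function("xfs_ilock")
{
        /* XFS_ILOCK_EXCL is (1<<2) in fs/xfs/xfs_inode.h of that era */
        if ($lock_flags & 4) {
                printf("xfs_ilock EXCL from %s (pid %d)\n",
                       execname(), pid())
                print_backtrace()
        }
}

Run it with stap while the test is going, then aggregate the
backtraces to see which callers take the lock exclusively.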

Thanks,
Da

On Sat, Jan 14, 2012 at 6:45 AM, Raghavendra D Prabhu <
raghu.prabh...@gmail.com> wrote:

> Hi Zheng,
>
>
> * On Fri, Jan 13, 2012 at 04:41:16PM -0500, Zheng Da <
> zhengda1...@gmail.com> wrote:
>
>> Hello,
>>
>> I'm looking for a FS in Linux that supports concurrent direct IO write.
>> ext4 supports concurrent direct IO read if we mount it with
>> dioread_nolock,
>> but doesn't support concurrent writes. XFS doesn't support concurrent
>> direct IO at all. It locks the inode exclusive if it's direct IO. I tried
>> btrfs, and it seems it doesn't support concurrent direct IO either though
>> I
>> haven't looked into its code.
>> Is there a local FS that support concurrent direct IO write? It seems NFS
>> supports it (
>> http://kevinclosson.wordpress.com/2011/08/12/file-systems-for-a-database-choose-one-that-couples-direct-io-and-concurrent-io-whats-this-have-to-do-with-nfs-harken-back-5-2-years-to-find-out/
>> ),
>> but I'm looking for local FS.
>>
>> Thanks,
>> Da
>>
>
>
> XFS locks the inode exclusive only if it is an unaligned direct IO,
> which is apparently done to prevent race conditions -- refer to
> http://oss.sgi.com/archives/xfs/2011-01/msg00157.html
> Also, the behavior of Ext4 under dioread_nolock is supported by XFS by
> default and in a much better way. Also, Ext4 is the only one which
> uses DIO_LOCKING while doing direct IO.
>
>
>
> Regards,
> --
> Raghavendra Prabhu
> GPG Id : 0xD72BE977
> Fingerprint: B93F EBCB 8E05 7039 CD3C A4B8 A616 DCA1 D72B E977
> www: wnohang.net
>


Re: network Driver interface to the stack

2012-01-15 Thread Jonathan Neuschäfer
On Sun, Jan 15, 2012 at 07:15:02PM +0530, V l wrote:
> the packet . Very nice ! But when I compile this code , its going to ask
> where this driver specific senddata is defined , the header file or library

Would you mind posting the exact code and compiler output?


Thanks,
Jonathan Neuschäfer



network Driver interface to the stack

2012-01-15 Thread V l
I studied my network interface driver and gained a sufficient
understanding of the network stack.
I have a network stack for my ARM device and I am now at the stage of
interfacing this stack to my network driver on the ARM device.
I know now that I have to just replace

for(q = p; q != NULL; q = q->next) {
/* Send the data from the pbuf to the interface, one pbuf at a
   time. The size of the data in each pbuf is kept in the ->len
   variable. */
send data from(q->payload, q->len);
  }


send data from(q->payload, q->len); with my driver-specific code to
send the packet. Very nice! But when I compile this code, the compiler
asks where this driver-specific send function is defined -- in which
header file or library.
How do I resolve this part of the riddle? It's a bit of a newbie
question, but I have tried and I give up now. Please help.
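
To be concrete, what I am trying to end up with is roughly this (the
names my_hw_send and my_driver.h are placeholders I made up, and the
loop above looks like the lwIP ethernetif.c skeleton, so adjust to
whatever your stack actually uses):

/* my_driver.h -- prototype visible to the stack glue code */
void my_hw_send(const void *data, unsigned short len);

/* my_driver.c -- implemented against the ARM MAC registers / DMA */
void my_hw_send(const void *data, unsigned short len)
{
        /* copy len bytes at data into the NIC TX buffer and start
           transmission -- hardware specific */
}

/* in the glue code: include the header and call it in the loop */
#include "my_driver.h"

for (q = p; q != NULL; q = q->next) {
        my_hw_send(q->payload, q->len);
}

and then make sure my_driver.c is part of the build (added to the
Makefile) so the linker can find the symbol.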
Regards
Sraddha


Re: Current Thread mapping

2012-01-15 Thread SaNtosh kuLkarni
Thanks Andi and Rohan for the quick reply.

Yes, you are right; I was asking about kernel-side scheduling of
user-space threads and how they are handled.

On Sun, Jan 15, 2012 at 4:02 PM, Andi  wrote:

>
> Hi Santi,
>
> I'm not sure that I've fully understood the question. What do you mean
> by mapping?
> If you mean kernel/user memory mapping, there is no mapping of the
> user task's memory kept in the kernel.
> If by mapping you mean the relation between userspace tasks and
> scheduling entities, then the relation is 1:1.
> Remember that the kernel doesn't really know that there is a difference
> between processes and threads; that's why tasks, threads and processes
> are all handled the same way. The only difference is the way you create
> them in userspace. As soon as you create one in your userspace process,
> the kernel links it to a kernel structure called 'struct task_struct'
> (linux/sched.h) which contains many descriptions of the thread/process
> you have just created.
> 1 userspace thread : 1 struct task_struct -- this is what I mean by
> 1:1 mapping
>
> Andi
>
> On 01/15/2012 10:32 AM, SaNtosh kuLkarni wrote:
>
> Hi everyone, I just wanted to know what the current implementation of
> user/kernel space thread mapping is: is it 1:1, or does it depend on
> the needs? For example, say I have 12k user-space threads running; how
> many kernel-space threads would be managing them? As far as I know
> there is a 1:1 mapping.
>
>  Thanks
>
>  --
> *Regards,
> Santi*
>
>
>
>
>


-- 
*Regards,
Santosh Kulkarni*


Re: Current Thread mapping

2012-01-15 Thread Andi

Hi Santi,

I'm not sure that I've fully understood the question. What do you mean
by mapping?
If you mean kernel/user memory mapping, there is no mapping of the
user task's memory kept in the kernel.
If by mapping you mean the relation between userspace tasks and
scheduling entities, then the relation is 1:1.
Remember that the kernel doesn't really know that there is a difference
between processes and threads; that's why tasks, threads and processes
are all handled the same way. The only difference is the way you create
them in userspace. As soon as you create one in your userspace process,
the kernel links it to a kernel structure called 'struct task_struct'
(linux/sched.h) which contains many descriptions of the thread/process
you have just created.

1 userspace thread : 1 struct task_struct -- this is what I mean by 1:1
mapping
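
If you want to see those task_structs from userspace, a quick check
(my own example, nothing official) is:

# one directory per kernel task, i.e. per struct task_struct;
# a process with N threads shows N entries
ls /proc/<pid>/task | wc -l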

Andi

On 01/15/2012 10:32 AM, SaNtosh kuLkarni wrote:
Hi everyone, I just wanted to know what the current implementation of
user/kernel space thread mapping is: is it 1:1, or does it depend on
the needs? For example, say I have 12k user-space threads running; how
many kernel-space threads would be managing them? As far as I know
there is a 1:1 mapping.


Thanks

--
*Regards,
Santi*




Re: Current Thread mapping

2012-01-15 Thread rohan puri
On Sun, Jan 15, 2012 at 3:02 PM, SaNtosh kuLkarni <
santosh.yesop...@gmail.com> wrote:

> Hi everyone, I just wanted to know what the current implementation of
> user/kernel space thread mapping is: is it 1:1, or does it depend on
> the needs? For example, say I have 12k user-space threads running; how
> many kernel-space threads would be managing them? As far as I know
> there is a 1:1 mapping.
>
> Thanks
>
> --
> *Regards,
> Santi*
>
> ___
> Kernelnewbies mailing list
> Kernelnewbies@kernelnewbies.org
> http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
>
>
Hi,
It basically depends on the threading library's implementation; AFAIK,
for pthreads (NPTL) it is a 1:1 mapping.

- Rohan


Current Thread mapping

2012-01-15 Thread SaNtosh kuLkarni
Hi everyone, I just wanted to know what the current implementation of
user/kernel space thread mapping is: is it 1:1, or does it depend on
the needs? For example, say I have 12k user-space threads running; how
many kernel-space threads would be managing them? As far as I know
there is a 1:1 mapping.

Thanks

-- 
*Regards,
Santi*