Re: [ANNOUNCE 0/7] Open-iSCSI/Linux-iSCSI-5 High-Performance Initiator
On Sat, 2005-07-30 at 15:23 -0500, James Bottomley wrote:
> On Sat, 2005-07-30 at 12:53 -0700, David S. Miller wrote:
> > From: James Bottomley <[EMAIL PROTECTED]>
> > Date: Sat, 30 Jul 2005 12:32:42 -0500
> >
> > > FIB has taken your netlink number, so I changed it to 32
> >
> > MAX_LINKS is 32, so there is no way this reassignment would
> > work.
>
> Actually, I saw this and increased MAX_LINKS as well. I was going to
> query all of this on the net-dev mailing list if we'd managed to get the
> code compileable.
>
> > You have to pick something in the range 0 --> 32, and as is
> > no surprise, there are no numbers available :-)
> >
> > Since ethertap has been deleted, 16-->31 could be made allocatable
> > once more, but I simply do not want to do that and have the flood
> > gates open up for folks allocating random netlink numbers.
> >
> > Instead, we need to take one of those netlink numbers, and turn
> > it into a multiplexable layer that can support an arbitrary
> > number of sub-netlink types. Said protocol would need some
> > shim header that just says the "sub-netlink" protocol number,
> > something as simple as just a "u32", this gets pulled off the
> > front of the netlink packet and then it's passed on down to the
> > real protocol.
>
> I'll let the iSCSI people try this ...
>
> Alternatively, if they don't fancy it, I think the kobject_uevent
> mechanism (which already has a netlink number) looks like it might be
> amenable for use for most of the things they want to do.

In fact, during the design phase we considered using kobject_uevent() as well, but (if I recall correctly) it didn't fit, for the simple reason that if we want to have that much code in user space, then we need more control over the netlink socket and need to pass binary data back and forth.

It would be nice to set MAX_LINKS to 64 and close this issue for now, since I'm pretty sure some other apps will find kobject_uevent() unsuitable for their needs too.
Dima - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
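For readers following the "multiplexable layer" suggestion quoted above, here is a minimal user-space sketch in C of the shim idea: a u32 sub-protocol number is pulled off the front of the message and the rest is handed to whoever registered for that number. Everything named here (`subnl_shim`, `subnl_register`, `subnl_demux`, `subnl_echo_len`, `SUBNL_MAX`) is hypothetical and made up for illustration; none of it is an existing kernel or netlink API, and the sketch ignores byte order, locking and socket details.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical shim header: a single u32 naming the sub-protocol. */
struct subnl_shim {
    uint32_t subproto;
};

/* Hypothetical handler signature for one sub-protocol. */
typedef int (*subnl_handler_t)(const void *payload, size_t len);

#define SUBNL_MAX 8  /* illustrative table size, not a kernel constant */

static subnl_handler_t subnl_table[SUBNL_MAX];

/* Register a handler for one sub-protocol number; returns 0 or -1. */
int subnl_register(uint32_t subproto, subnl_handler_t fn)
{
    if (subproto >= SUBNL_MAX || fn == NULL || subnl_table[subproto] != NULL)
        return -1;
    subnl_table[subproto] = fn;
    return 0;
}

/* Pull the shim off the front of the message and dispatch the rest. */
int subnl_demux(const void *msg, size_t len)
{
    struct subnl_shim shim;

    assert(msg != NULL);
    if (len < sizeof(shim))
        return -1;                      /* too short to carry the shim */
    memcpy(&shim, msg, sizeof(shim));   /* copy out, avoids unaligned reads */
    if (shim.subproto >= SUBNL_MAX || subnl_table[shim.subproto] == NULL)
        return -1;                      /* nobody registered for this number */
    return subnl_table[shim.subproto]((const char *)msg + sizeof(shim),
                                      len - sizeof(shim));
}

/* Example handler: reports how many payload bytes it was given. */
int subnl_echo_len(const void *payload, size_t len)
{
    (void)payload;
    return (int)len;
}
```

With this shape, adding another consumer costs a table slot rather than a new top-level netlink number, which is the point of the quoted proposal.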
Re: [linux-iscsi-devel] [ANNOUNCE] open-iscsi and linux-iscsi project teams have merged!
On Thu, 2005-04-14 at 13:16 +0200, Christoph Hellwig wrote:
> On Mon, Apr 11, 2005 at 10:30:58PM -0400, linux-iscsi development team wrote:
> > The linux-iscsi and open-iscsi developers would like to announce
> > that they have combined forces on a single iSCSI initiator effort!
>
> What SCM will the code be in? I must admit I really, really prefer the
> SVN hosting of open-iscsi over the sf.net CVS mess.

Consider the linux-iscsi-5.x CVS branch as the "mainline". The current open-iscsi SVN repository is where all hard-core development will happen, at least for the near future. I really hope sf.net will provide SVN hosting very soon. Then we will see how it goes, and maybe we will just migrate the current berlios.de hosting to sf.net.
Re: [ANNOUNCE 2/6] Linux-iSCSI High-Performance Initiator
On Mon, 2005-04-11 at 22:35 -0700, Greg KH wrote:
> On Mon, Apr 11, 2005 at 08:24:08PM -0700, Alex Aizman wrote:
> > +typedef uint64_t iscsi_snx_t; /* iSCSI Data-Path session handle */
> > +typedef uint64_t iscsi_cnx_t; /* iSCSI Data-Path connection handle */
>
> Do you really have to create a new typedef? Please reconsider. Just
> use u64 everywhere, unless you need to do type checking...

It is a handle, and it is used as a parameter in the exported API. So yes: type checking is exactly the reason.

Dima
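A side note on the type-checking argument, as a sketch rather than a description of the patch: in C, a typedef of an integer type is only an alias, so the compiler will accept a session handle where a connection handle is expected. One common way to get compiler-enforced checking is to wrap the integer in a one-member struct. The names below (`snx_handle_t`, `cnx_handle_t`, `snx_from_u64`, `cnx_from_u64`, `cnx_id`) are hypothetical and do not appear in the posted code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical struct-wrapped handles; the patch itself uses plain typedefs. */
typedef struct { uint64_t v; } snx_handle_t;   /* session handle */
typedef struct { uint64_t v; } cnx_handle_t;   /* connection handle */

/* Constructors keep the raw integer in one place. */
snx_handle_t snx_from_u64(uint64_t v) { snx_handle_t h = { v }; return h; }
cnx_handle_t cnx_from_u64(uint64_t v) { cnx_handle_t h = { v }; return h; }

/* Illustrative API entry point: accepts only a connection handle. */
uint64_t cnx_id(cnx_handle_t h)
{
    assert(h.v != 0);   /* 0 is treated as "no handle" in this sketch */
    return h.v;
}
```

With these definitions, `cnx_id(snx_from_u64(1))` is rejected at compile time because the two struct types are incompatible, whereas with two `uint64_t` typedefs the same mistake would compile silently.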
Re: Kernel SCM saga..
On Thu, 2005-04-07 at 13:54 -0400, Daniel Phillips wrote:
> Three years ago, there was no fully working open source distributed scm code
> base to use as a starting point, so extending BK would have been the only
> easy alternative. But since then the situation has changed. There are now
> several working code bases to provide a good starting point: Monotone, Arch,
> SVK, Bazaar-ng and others.

Right. For example, SVK is a pretty mature project and is very close to its 1.0 release now. It supports all kinds of merges, including Cherry-Picking Mergeback:

http://svk.elixus.org/?MergeFeatures

Dmitry
Re: AOE and large filesystems?
On Tue, 2005-04-05 at 19:07 -0400, Jeff Garzik wrote:
> As a tangent, I'd also like to see iSCSI over SCTP.

The existing iSCSI-over-TCP IETF draft just does not fit SCTP. There was some activity on IPS recently:

http://www1.ietf.org/mail-archive/web/ips/current/msg01279.html

The conclusion was that a new IETF draft describing iSCSI over an SCTP transport is needed.

Dmitry

> Jeff
Re: oom-killer disable for iscsi/lvm2/multipath userland critical sections
Andrea,

I just successfully tested the patch in my environment. It actually resolved the OOM-killer problem for my iscsid. Important note: the daemon's parent must be init.

In my test, the OOM-killer killed everything around except iscsid, and iscsid successfully finished registering a new SCSI host in the middle of the crazy OOM-killer run :)

Thanks!
Dima

On Sat, 2005-04-02 at 00:14 +0200, Andrea Arcangeli wrote:
> Hello,
>
> some private discussion (that was continuing some kernel-summit-discuss
> thread) ended in the below patch. I also liked a textual "disable"
> instead of value "-17" (internally to the kernel it could be represented
> the same way, but the /proc parsing would be more complicated). If you
> prefer textual "disable" we can change this of course.
>
> Comments welcome.
>
> From: Andrea Arcangeli <[EMAIL PROTECTED]>
> Subject: oom killer protection
>
> iscsi/lvm2/multipath needs guaranteed protection from the oom-killer.
>
> Signed-off-by: Andrea Arcangeli <[EMAIL PROTECTED]>
>
> --- 2.6.12-seccomp/fs/proc/base.c.~1~	2005-03-25 05:13:28.0 +0100
> +++ 2.6.12-seccomp/fs/proc/base.c	2005-04-01 23:47:22.0 +0200
> @@ -751,7 +751,7 @@ static ssize_t oom_adjust_write(struct f
>  	if (copy_from_user(buffer, buf, count))
>  		return -EFAULT;
>  	oom_adjust = simple_strtol(buffer, &end, 0);
> -	if (oom_adjust < -16 || oom_adjust > 15)
> +	if ((oom_adjust < -16 || oom_adjust > 15) && oom_adjust != OOM_DISABLE)
>  		return -EINVAL;
>  	if (*end == '\n')
>  		end++;
> --- 2.6.12-seccomp/include/linux/mm.h.~1~	2005-03-25 05:13:28.0 +0100
> +++ 2.6.12-seccomp/include/linux/mm.h	2005-04-01 23:53:11.0 +0200
> @@ -856,5 +856,8 @@ int in_gate_area_no_task(unsigned long a
>  #define in_gate_area(task, addr) ({(void)task; in_gate_area_no_task(addr);})
>  #endif /* __HAVE_ARCH_GATE_AREA */
>  
> +/* /proc/<pid>/oom_adj set to -17 protects from the oom-killer */
> +#define OOM_DISABLE -17
> +
>  #endif /* __KERNEL__ */
>  #endif /* _LINUX_MM_H */
> --- 2.6.12-seccomp/mm/oom_kill.c.~1~	2005-03-08 01:02:30.0 +0100
> +++ 2.6.12-seccomp/mm/oom_kill.c	2005-04-01 23:46:18.0 +0200
> @@ -145,7 +145,7 @@ static struct task_struct * select_bad_p
>  	do_posix_clock_monotonic_gettime(&uptime);
>  	do_each_thread(g, p)
>  		/* skip the init task with pid == 1 */
> -		if (p->pid > 1) {
> +		if (p->pid > 1 && p->oomkilladj != OOM_DISABLE) {
>  			unsigned long points;
>  
>  			/*
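To show how a daemon might use the value introduced by the patch above, here is a minimal user-space sketch in C that writes -17 to an oom_adj-style file. The helper name `write_oom_adj` and the macro `OOM_DISABLE_VALUE` are made up for this example; the path is passed in by the caller so the helper can be tried against an ordinary file. Lowering the value on a real system typically needs root, and later kernels added `/proc/<pid>/oom_score_adj` and deprecated `oom_adj`, so treat this as an illustration of the interface discussed in this thread only.

```c
#include <assert.h>
#include <stdio.h>

#define OOM_DISABLE_VALUE (-17)  /* mirrors OOM_DISABLE from the quoted patch */

/* Write an adjustment value to an oom_adj-style file; returns 0 or -1. */
int write_oom_adj(const char *path, int value)
{
    FILE *f;
    int ok;

    assert(path != NULL);
    f = fopen(path, "w");
    if (f == NULL)
        return -1;                       /* no such file or no permission */
    ok = fprintf(f, "%d\n", value) > 0;  /* the kernel parses a decimal string */
    if (fclose(f) != 0)
        ok = 0;                          /* the write error may surface on close */
    return ok ? 0 : -1;
}
```

A daemon would call `write_oom_adj("/proc/self/oom_adj", OOM_DISABLE_VALUE)` early in startup and log a warning if it returns -1.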
Re: [ANNOUNCE 0/6] Open-iSCSI High-Performance Initiator for Linux
> > - highly optimized and small-footprint data path;
> > - multiple outstanding R2Ts;
> > - thread-less receive;
> > - sendpage() based transmit;
> > - zero-copy header processing on receive;
> > - no data path memory allocations at runtime;
> > - persistent configuration database;
> > - SendTargets discovery;
> > - CHAP;
> > - DataSequenceInOrder=No;
> > - PDU header Digest;
> > - multiple sessions;
> > - MC/S (note: disabled in the patch);
> > - SCSI-level recovery via Abort Task and session re-open.
> >
> > TODO
> >
> > The near term plan is: test, test, and test. We need to stabilize the
> > existing code, after 5 months of development this seems to be the right
> > thing to do.
> >
> > Other short-term plans include:
> >
> > a) process community feedback, implement comments and apply patches;
> > b) cleanup user side of the iSCSI open interface; use API calls (instead of
> >    directly constructing events);
> > c) eliminate runtime control path memory allocations (for Nop-In, Nop-Out,
> >    etc.);
> > d) implement Write path optimizations (delayed because of the self-imposed
> >    submission deadline);
> > e) oProfile the data path, use the reports for further optimization;
> > f) complete the readme.
> >
> > Comments, code reviews, patches - are greatly appreciated!
> >
> > THANKS
> > ==
> >
> > Special thanks to our first reviewers: Christoph Hellwig and Mike
> > Christie.
> >
> > Special thanks to Ming Zhang for help in testing and for insightful
> > questions.
> >
> > Regards,
> >
> > Alex Aizman & Dmitry Yusupov
> >
> > =
> >
> > The following 6 patches alltogether represent the Open-iSCSI Initiator:
> >
> > Patch 1:
> > SCSI LLDD consists of 3 files:
> > - iscsi_if.c (iSCSI open interface over netlink);
> > - iscsi_tcp.[ch] (iSCSI transport over TCP/IP).
> >
> > Patch 2:
> > Common header files:
> > - iscsi_if.h (iSCSI open interface over netlink);
> > - iscsi_proto.h (RFC3720 #defines and types);
> > - iscsi_ifev.h (user/kernel events).
> >
> > Patch 3:
> > drivers/scsi/Kconfig changes.
> >
> > Patch 4:
> > drivers/scsi/Makefile changes.
> >
> > Patch 5:
> > include/linux/netlink.h changes (added new protocol NETLINK_ISCSI)
> >
> > Patch 6:
> > Documentation/scsi/iscsi.txt
Re: [ANNOUNCE 0/6] Open-iSCSI High-Performance Initiator for Linux
On Thu, 2005-03-10 at 11:27 +0100, Lars Marowsky-Bree wrote:
> On 2005-03-09T18:36:37, Alex Aizman <[EMAIL PROTECTED]> wrote:
>
> > >That works well in our current development series, and if you want to
> > >share code, you can either rip it off (Open Source, we love ya ;) or we
> > >can spin off these parts into a sub-package for you to depend on...
> > If it's not a big deal :-) let's do the "sub-package" option.
>
> I've brought this up on the linux-ha-dev list. When do you need this?

For open-iscsi, I think it would make sense to link the open-iscsi daemon code against klibc, the same way dm-multipath does. This will allow us to build iSCSI remote boot using early user space.

I'm not sure it will be possible to use your package without modifications. Let me know.

Dmitry
Re: [ANNOUNCE 0/6] Open-iSCSI High-Performance Initiator for Linux
On Tue, 2005-03-08 at 22:50 -0800, Matt Mackall wrote:
> On Tue, Mar 08, 2005 at 10:25:58PM -0800, Dmitry Yusupov wrote:
> > On Tue, 2005-03-08 at 22:05 -0800, Matt Mackall wrote:
> > > On Tue, Mar 08, 2005 at 09:51:39PM -0800, Alex Aizman wrote:
> > > > Matt Mackall wrote:
> > > >
> > > > >How big is the userspace client?
> > > > >
> > > > Hmm.. x86 executable? source?
> > > >
> > > > Anyway, there's about 12,000 lines of user space code, and growing. In
> > > > the kernel we have approx. 3,300 lines.
> > > >
> > > > >>- 450MB/sec Read on a single connection (2-way 2.4Ghz Opteron, 64KB
> > > > >>block size);
> > > > >
> > > > >With what network hardware and drives, please?
> > > > >
> > > > Neterion's 10GbE adapters. RAM disk on the target side.
> > >
> > > Ahh.
> > >
> > > Snipped my question about userspace deadlocks - that was the important
> > > one. It is in fact why the sfnet one is written as it is - it
> > > originally had a userspace component and turned out to be easy to
> > > deadlock under load because of it.
> >
> > As Scott Ferris pointed out, the main reason for deadlock in sfnet was
> > blocking behavior of page cache when daemon tried to do filesystem IO,
> > namely syslog().
>
> That was just one of several problems. And ISTR deciding that
> particular one was quite nasty when we first encountered it though I
> no longer remember the details.

That's bad, since all those details might help us avoid problems and save time in the future daemon design. I would really appreciate it if you could point me to the other potential problems once you recall them.

> > That was 2.4.x kernel. We don't know whether it is
> > fixed in 2.6.x. If someone knows, please let us know. Meanwhile we came
> > up with work-around design in user-space. "Paged out" problem fixed
> > already in our subversion repository by utilizing mlockall()
> > syscall.
>
> I presume this is dynamically linked against glibc?

Over time it will be linked against klibc, as dm-multipath is. That will also help to implement iSCSI boot, where the control-plane daemon will be part of the initramfs image.

> > Also we have IMHO, working solution for OOM during ERL=0 TCP re-connect.
>
> Care to describe it?

Sure. The idea is to always keep a second, reserved/redundant TCP connection open per session (please note: a TCP connection, not an iSCSI connection). This way, during the recovery cycle, and assuming a sane target, the initiator switches to the redundant TCP connection and sends the Login request over it. This could be implemented as a feature and might be disabled via the configuration utility if needed.

Dmitry
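The mlockall() workaround mentioned in the quoted text can be illustrated with a short C sketch. This is not the open-iscsi daemon code; `lock_daemon_memory` is a hypothetical helper name, and the only real API used is `mlockall(2)` with the standard `MCL_CURRENT` and `MCL_FUTURE` flags. The call may fail without sufficient privileges or with a low `RLIMIT_MEMLOCK`, so the helper reports the error to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>

/* Lock all current and future pages of the calling process into RAM so the
 * daemon cannot be paged out. Returns 0 on success, or the errno value. */
int lock_daemon_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        int err = errno;

        assert(err > 0);    /* a failed syscall always sets errno */
        return err;
    }
    return 0;
}
```

A daemon would typically call this once at startup, before it begins servicing kernel events, and decide from the return value whether to continue unlocked or exit.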
Re: [ANNOUNCE 0/6] Open-iSCSI High-Performance Initiator for Linux
On Tue, 2005-03-08 at 22:05 -0800, Matt Mackall wrote:
> On Tue, Mar 08, 2005 at 09:51:39PM -0800, Alex Aizman wrote:
> > Matt Mackall wrote:
> >
> > >How big is the userspace client?
> > >
> > Hmm.. x86 executable? source?
> >
> > Anyway, there's about 12,000 lines of user space code, and growing. In
> > the kernel we have approx. 3,300 lines.
> >
> > >>- 450MB/sec Read on a single connection (2-way 2.4Ghz Opteron, 64KB block
> > >>size);
> > >
> > >With what network hardware and drives, please?
> > >
> > Neterion's 10GbE adapters. RAM disk on the target side.
>
> Ahh.
>
> Snipped my question about userspace deadlocks - that was the important
> one. It is in fact why the sfnet one is written as it is - it
> originally had a userspace component and turned out to be easy to
> deadlock under load because of it.

As Scott Ferris pointed out, the main reason for the deadlock in sfnet was the blocking behavior of the page cache when the daemon tried to do filesystem I/O, namely syslog(). That was on a 2.4.x kernel. We don't know whether it is fixed in 2.6.x; if someone knows, please let us know. Meanwhile, we came up with a work-around design in user space. The "paged out" problem is already fixed in our Subversion repository by using the mlockall() syscall.

We also have, IMHO, a working solution for OOM during ERL=0 TCP re-connect.

Dmitry
Re: UDP optimization
As far as UDP HW acceleration is concerned, we need to modify/verify that the Linux TCP/IP stack is capable of:

1) Partial checksumming on receive
2) Checksumming over fragments on transmit

and find a NIC that is capable of doing that. s2io/Neterion hardware does support those features.

Regards,
Dima

Without those two, I doubt you will

On Fri, 2005-02-25 at 14:21 +0330, shabanip wrote:
> as i know there are many ways to optimize and tune TCP parameters in kernel
> but how can i tune and optimize UDp performance?
> thanks,
> Payam Shabanian
> shabanip -at- avapajoohesh.com
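The two checksum items above rely on a property of the Internet checksum: the one's-complement sum can be accumulated piece by piece and folded once at the end, so a datagram split into fragments gives the same result as the whole datagram. The C sketch below demonstrates only that arithmetic (in the style of RFC 1071); it is not NIC offload code or the kernel's implementation, and the function names `csum_partial_sketch` and `csum_fold_sketch` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Accumulate a 16-bit one's-complement sum over buf (big-endian words),
 * continuing from a previous partial sum. Every fragment except the last
 * must have even length in this sketch. The 32-bit accumulator is wide
 * enough for inputs up to the 64KB size of a UDP datagram. */
uint32_t csum_partial_sketch(const uint8_t *buf, size_t len, uint32_t sum)
{
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (i < len)
        sum += (uint32_t)buf[i] << 8;   /* pad the odd trailing byte with 0 */
    return sum;
}

/* Fold the 32-bit accumulator down to 16 bits and complement it. */
uint16_t csum_fold_sketch(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

Summing the first fragment, feeding the result into the call for the next fragment, and folding once at the end yields the same 16-bit value as one pass over the whole buffer, which is what makes per-fragment or partial checksumming possible in the first place.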