On Fri, Jul 21, 2017 at 12:54:15PM +0800, Stu Midgley wrote:
> Afternoon
>
> I have an MDS running on spinning media and wish to migrate it to SSD's.
>
> Lustre 2.9.52
> ZFS 0.7.0-rc3
This may not be a stable combination - I don't think Lustre officially
supports 0.7.0-rc yet. Plus, ther
On Tue, Jun 09, 2015 at 11:10:21AM -0400, Kurt Strosahl wrote:
> Good Morning,
>
>That seems to have done the trick. For the benefit of everyone on this
> list using zfs... The issue I encountered with zfs is described here:
> https://github.com/zfsonlinux/zfs/issues/2523
>
>To resolve
The dnodes are stored in data blocks of the meta_dnode, whose block
size is a fixed constant:
#define DNODE_BLOCK_SHIFT 14 /* 16k */
Again, this is not affected by ZFS recordsize.
-Isaac
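
To make the arithmetic concrete, here is a minimal Python sketch of what that fixed constant implies. DNODE_BLOCK_SHIFT is taken from the message above; DNODE_SHIFT = 9 (512-byte dnodes) is an assumption about ZFS releases of that era and is not stated in this thread, and the function names are made up for illustration.

```python
# Sizes implied by the fixed meta_dnode block shift quoted above.
DNODE_BLOCK_SHIFT = 14   # from the message: 16k meta_dnode data blocks
DNODE_SHIFT = 9          # ASSUMPTION: 512-byte on-disk dnodes (not stated in the thread)


def dnode_block_size() -> int:
    """Size in bytes of one meta_dnode data block."""
    return 1 << DNODE_BLOCK_SHIFT


def dnodes_per_block() -> int:
    """Number of dnodes that fit in one meta_dnode data block."""
    return 1 << (DNODE_BLOCK_SHIFT - DNODE_SHIFT)


if __name__ == "__main__":
    print(dnode_block_size(), dnodes_per_block())
```

Under that assumption, fetching a single dnode reads a 16 KiB block holding 32 dnodes, whatever the dataset's recordsize is set to.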
On Wed, May 06, 2015 at 09:02:11PM -0600, Isaac Huang wrote:
> On Tue, May 05, 2015 at 05:16:1
On Tue, May 05, 2015 at 05:16:14PM +, Alexander I Kulyavtsev wrote:
> ..
> Shall we use smaller ZFS record size on MDT, say 8KB or 16KB? If inode is
> ~10KB and zfs record 128KB, we are dropping caches and read data we do not
> need.
The ZFS recordsize does not affect Lustre OST/MDT. The
The dnodes are ditto'ed over whatever redundancy the raidz/mirror
already provides, so for 2-way mirrors that'd be multiplied by 4 from
the compressed dnode size. BTW, all ZFS meta-data are compressed by
default. The recent 0.6.4 release supports LZ4 compression of meta
data, which I found in some
Since there's no TRIM support for ZFS on Linux yet, I wonder if
someone has data/experience to share about ZFS on SSD performance as
the SSDs age. Some believe for modern over-provisioned SSDs, lack of
TRIM isn't any big deal but I talked with some SSD developers here
and they all disagreed.
-Isaa
I'm not sure about liblustre, but user space support has already been
removed from Lustre networking stack. I believe that'd eliminate any
chance of FUSE Lustre client.
-Isaac
On Thu, Feb 26, 2015 at 11:59:23AM +0800, 邓尧 wrote:
> The lustre wiki page
> (http://wiki.lustre.org/index.php/LibLustre_
You don't have to wait for Lustre 2.7. The dynamic LNet config feature
will enable configuration of LNet interfaces and other parameters
without reloading the kernel module, but LNet routes have always
been dynamically configurable with "lctl add_route/del_route".
-Isaac
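
As a rough sketch of what that looks like in practice, the helpers below only build the command lines and do not run them. The "--net <remote net>" argument form is an assumption about how lctl is usually invoked and should be checked against the lctl man page for the installed release; the network name and gateway NID are placeholders, and the function names are hypothetical.

```python
from typing import List


def add_route_cmd(remote_net: str, gateway_nid: str) -> List[str]:
    """Build (but do not run) an lctl command that adds a route.

    ASSUMPTION: the '--net <remote_net> add_route <gateway>' form.
    """
    return ["lctl", "--net", remote_net, "add_route", gateway_nid]


def del_route_cmd(remote_net: str, gateway_nid: str) -> List[str]:
    """Build (but do not run) the matching lctl command that removes the route."""
    return ["lctl", "--net", remote_net, "del_route", gateway_nid]


if __name__ == "__main__":
    # Placeholder values, not taken from the thread.
    print(" ".join(add_route_cmd("o2ib0", "192.168.0.1@tcp0")))
    print(" ".join(del_route_cmd("o2ib0", "192.168.0.1@tcp0")))
```

Either list could be handed to subprocess.run() on a node where the lnet module is already loaded; no module reload is involved, which is the point made above.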
On Wed, Jan 07, 2015 a
On Tue, Sep 09, 2014 at 05:04:58AM -0600, James Robnett wrote:
>
> I'm having difficulty figuring out a solution to an LNET issue I'm having.
>
> We have two Lustre filesystems separated by about 60 miles, both of
> which have o2ib0(ib0) and tcp(eth0) networks defined. Both have IB
> and TCP cli
On Wed, Jun 18, 2014 at 06:11:33AM -0400, Anjana Kar wrote:
> ..
> Instead we have moved to ldiskfs MDT and zfs OSTs, with the same lustre/zfs
> versions, and have a lot more inodes available.
>
> Filesystem Inodes IUsed IFree IUse% Mounted on
> x.x.x.x@o2ib:/iconfs
>
On Thu, Jun 12, 2014 at 04:41:14PM +, Dilger, Andreas wrote:
> It looks like you've already increased arc_meta_limit beyond the default,
> which is c_max / 4. That was critical to performance in our testing.
>
> There is also a patch from Brian that should help performance in your case:
> htt
On Thu, Aug 15, 2013 at 04:09:45PM +0400, Vsevolod Nikonorov wrote:
> ..
> Is Lustre routing something to do with TCP/IP routing? Should I set
> net.ipv4.ip_forward to 1 in sysctl.conf? Should I do some IP masquerade for
> Lustre routing to work properly?
No.
- Isaac
___
On Tue, Feb 26, 2013 at 01:04:06PM -0500, mages, brian wrote:
> Hi,
>
> It appears that I've resolved the issue and therefore wanted to provide an
> update to this list. As I noted in the description of my configuration, the
> client only has a single IB interface. After changing the options f
On Mon, Jan 28, 2013 at 04:23:37PM +0100, Alexander Oltu wrote:
> On Thu, 24 Jan 2013 17:29:00 +0100
> Sébastien Buisson wrote:
>
>
> >
> > In your case, I think it would mean:
> > routes="gni0 xxx.xxx.110.xxx@tcp0 \
> > gni1 xxx.xxx.111.xxx@tcp1"
> >
>
> Looks like this can be a work
On Wed, Jan 23, 2013 at 02:10:54PM +0100, Alexander Oltu wrote:
> ..
> routes="gni0 xxx.xxx.110.xxx@tcp0 xxx.xxx.111.xxx@tcp1"
>
> And getting:
>
> LustreError: 5598:0:(router.c:399:lnet_check_routes()) Routes to gni
> via xxx.xxx.111.xxx@tcp1 and xxx.xxx.110.xxx@tcp not supported
>
> I chec
On Fri, Nov 02, 2012 at 12:04:02AM -0400, Ms. Megan Larko wrote:
> ..
> What steps should I take to generate a successful "lctl ping a.b.c.d"?
There must be a LNet instance running over SOCKLND on a.b.c.d.
- Isaac
___
Lustre-discuss mailing list
Lus
You'll have to write a UDP driver for the Lustre networking stack, not
an easy task.
- Isaac
On Mon, Apr 09, 2012 at 10:44:11PM +, Hebenstreit, Michael wrote:
>
> See title...
>
> Thanks
> Michael
>
>
> Michael Hebens
Hi all,
I'd suggest starting with simple point-to-point tests. There are too
many variables involved in a 'dd'. Please:
- Do a native IB write test from A to B, of 1M transfers, which is the
max payload per Lustre RPC. With native IB bandwidth test tool, I
remember there used to be an ib_write_
On Sun, Jan 08, 2012 at 10:20:36AM -0700, Andreas Dilger wrote:
> Isaac,
> I'm all in favor of using static code analysis tools to find bugs like this.
> The first step, as you have done is to find and fix the bugs (though with
> proper patches since LASSERT() as a means of error handling is unac
On Sun, Jan 08, 2012 at 09:43:00AM +, Nikitas Angelinas wrote:
> Hi Isaac,
>
> Funny, I was planning to have a look at this, this weekend if time
> permitted. I was interested in finding out how noticeable the issue of
> false positives may be in Coccinelle, but that shouldn't be a big
> probl
Today I decided to try Coccinelle on latest Lustre code found on
master at git://git.whamcloud.com/fs/lustre-release.git.
I came up with a simple Coccinelle script that tries to detect the
case where a new object is allocated and dereferenced without checking
it against NULL.
Eight such bugs were
On Mon, Aug 01, 2011 at 02:52:07PM +0200, Peter Kjellström wrote:
> > > On 2011-07-29, at 11:33 AM, Brock Palen wrote:
> > ..
> > Does that make sense? Is it even right for me to expect that I could
> > combine the performance together and expect full speed in and full speed
> > out if I can c
On Thu, Jul 14, 2011 at 12:43:32PM -0700, Adesanya, Adeyemi wrote:
>
> Just need some clarification on this:
>
> We use the o2ib driver for Lustre IB communication. We also use IPoIB to
> define IP addresses for the IB interfaces in the network. Does the MTU
> configuration parameter impact Lu
On Tue, Jul 12, 2011 at 02:12:38PM -0700, Peter Jones wrote:
> Isaac
>
> If you (or anyone else for that matter) is having trouble joining the
> group let me know privately at pjo...@whamcloud.com which email address
> that you would like to use and I will add you manually.
Thanks Peter, I got
On Tue, Jul 12, 2011 at 11:06:40AM -0700, Rick Wagner wrote:
> On Jul 12, 2011, at 11:01 AM, Isaac Huang wrote:
>
> > On Mon, Jul 11, 2011 at 03:39:34PM -0700, Rick Wagner wrote:
> >> Hi,
> >> ..
> >> I am assuming that -113 is EHOSTUNREACH and -107 is E
On Sun, Jul 03, 2011 at 10:36:46PM +0200, Adrian Ulrich wrote:
>
> > you can subscribe simply by sending an e-mail to
> > wc-discuss+subscr...@googlegroups.com.
>
> This bounces, but sending an e-mail to
> works.
> However: The link in the verification mail will take you to a login page - so
>
On Mon, Jul 11, 2011 at 03:39:34PM -0700, Rick Wagner wrote:
> Hi,
> ..
> I am assuming that -113 is EHOSTUNREACH and -107 is ENOTCONN, and that the
> error codes from errno.h are being used.
>
> We've been experiencing similar problems for a while, and we've never seen IP
> traffic have a p
I think it's TCP/IP according to the process list. It'd help to find
out where the CPU time was spent, e.g. by oprofile.
- Isaac
On Wed, Jul 06, 2011 at 12:14:54PM -0600, Colin Faber wrote:
> Hi,
>
> More details are needed here. What type of interconnect are you using?
> What are your clients
On Tue, Feb 08, 2011 at 05:44:35PM +0100, Ramiro Alba wrote:
> Hi everybody,
>
> We have a 128 nodes (8 cores/node) 4x DDR IB cluster with 2:1
> oversubscription and I use the IB net for:
>
> - OpenMPI
> - Lustre
> - Admin (may change in future)
>
> I'm very interested in using IB QoS, as in th
On Tue, Oct 19, 2010 at 11:05:08AM +0800, liang.whamcloud wrote:
> ..
> to confirm realtime data flow. Of course, It's not difficult to make
> LNet record information like forwarded bytes on each router.
I think it's already recorded - in the second to last field in
"/proc/sys/lnet/stats".
Test Oracle SMTP server connectivity issue - bug 22291
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
On Mon, Mar 01, 2010 at 02:35:18PM -0500, Oleg Drokin wrote:
> Hello!
>
> On Feb 28, 2010, at 9:31 PM, huangql wrote:
> > We got a problem that the MDS has high load value and the system CPU is up
> > to 60% when running chown command on client. It's strange that the load
> > value and system CP
On Mon, Feb 22, 2010 at 03:22:52AM -0800, Vipul Pandya wrote:
> Hello Issac,
Hi Vipul,
> ..
> I lowered the map_on_demand value to 16 and now it works fine.
>
> However, I had once concern, whether lowering down this map_on_demand
> value would impact the performance of Lustre or not?
For i
On Tue, Feb 23, 2010 at 08:27:40PM +0530, Vineet ghatge wrote:
>Hi all,
>I am trying to get Lustre 1.8 version up and running on fedora
>(standalone for the time being)
> When I try to run the command "lctl network up" I get The following
>error:-
>opening /dev/lnet failed:
On Mon, Feb 15, 2010 at 09:45:10PM -0800, Vipul Pandya wrote:
> ..
> -> I tried to load the ko2iblnd module as you have suggested. But still
> I am unable to do 'lctl ping'. I am getting the same error as shown
> below.
> #> modprobe ko2iblnd map_on_demand=64
Please lower it to "map_on_demand=
On Fri, Feb 12, 2010 at 05:53:19AM -0800, Vipul Pandya wrote:
>..
>#> lctl network up
>LNET configured
>Above command gave me following error in dmesg
>#> dmesg
>
>Lustre: Listener bound to eth2:102.88.88.188:987:cxgb3_0
>Lustre: Register global MR array, MR size: 0
On Thu, Feb 11, 2010 at 03:33:33PM +0100, Sebastian Reitenbach wrote:
> Hi,
>
> in my test system I installed Lustre 1.8.2 from source on a opensuse 10.2
> i386
> (2.6.18.8-0.13-xenpaelustre) as a client. Other clients and the servers are
> running 1.8.1 on SLES 11 x86_64 (2.6.27.39-0.3-xen-lus
On Wed, Jan 27, 2010 at 08:35:30AM -0800, Frank Leers wrote:
> ..
> > Thanks Frank. My questions are for QDR and IB-Bonding with Lustre.
>
> None of this is really QDR-specific, but have a look at :
>
> https://bugzilla.lustre.org/show_bug.cgi?id=20153
> and
> https://bugzilla.lustre.org/sh
On Mon, Nov 16, 2009 at 08:01:12PM -0500, Dardo D Kleiner - CONTRACTOR wrote:
> So are you suggesting I could just comment out the check in router.c?
That's enough for lnet but Lustre changes must also be made.
Isaac
On Mon, Nov 16, 2009 at 04:38:03PM -0500, Dardo D Kleiner - CONTRACTOR wrote:
> Stand down. Don't know what was wrong with my configuration at first,
> but it does instantiate the two NIDs on the host with multiple ports
> on a single HCA. Unfortunately,
>
> LustreError: 17771:0:(router.c:464:ln
On Mon, Nov 16, 2009 at 02:51:01PM -0700, Lundgren, Andrew wrote:
>Is there a command to pull the addresses of every device connected to a
>cluster?
>
>I have found:
>
>lct -net tcp [peer_list | conn_list]
This would only show immediate peers, i.e. next-hop peers. In a routed
con
On Fri, Nov 13, 2009 at 03:34:14PM -0500, Dardo D Kleiner - CONTRACTOR wrote:
> Mellanox ConnectX MT25418, two ports, each connected to a separate
> IB fabric - ib0 and ib1 have distinct IP subnets, each connected
> to a separate Lustre router.
> ..
> ip ad ls:
> 4: ib0: mtu 65520 qdisc pfifo_
On Thu, Nov 12, 2009 at 12:47:33PM -0500, Brian J. Murrell wrote:
> On Thu, 2009-11-12 at 10:37 +, Chris Exton wrote:
> > I am having a few problems with Lustre and I can't seem to find the
> > answer to my problem on the web so I wondered if you could help?
>
> You have networking problems
On Wed, Nov 11, 2009 at 04:07:39PM -0600, Daneil Goodman wrote:
>Hello list,
>By searching the archive, I found a similar message dated back in
>January 2008 -- How do you make an MGS/OSS listen on 2 NICs? Looks like
>there is no final solution and I am facing the similar situation
On Tue, Nov 10, 2009 at 08:02:03AM -0500, Dardo D Kleiner - CONTRACTOR wrote:
> ..
> At this point it clearly doesn't matter if I mess with max_rpcs_in_flight
> which used
> to be a way to mitigate the high BDP.
>
> Are there new parameters and/or tunings for ko2iblnd we're supposed to be
>
On Mon, Nov 09, 2009 at 10:55:34PM -0800, Eric Adint wrote:
> OK i have read the manual and i have read the boards and done as much
> research as i can, but i cant seem to bend my head around this, what i
> want to do is create a router so that i can keep my OST and MGS/MDT on
> the IB networ
On Fri, Nov 06, 2009 at 12:34:34AM +0100, Piotr Wadas wrote:
> ..
> --
> options lnet networks=tcp0
> --
When an interface name has been omitted, the lnet would iterate over
the list of system IP interfaces (by SIOCGIFCONF) and choose the 1st
one whose status is "up" (SIOCGIFFLAGS) and has bee
On Mon, Nov 09, 2009 at 02:48:34PM +0100, Heiko Schröter wrote:
> Hello,
>
> we do encounter peaks of up to 30% packet loss in our Gigabit Network.
It would be helpful if you'd elaborate on where the 30% came from.
> This is sporadic, say once every hour remaining for some seconds. We cannot
>
On Mon, Sep 07, 2009 at 06:58:39PM +0100, Wojciech Turek wrote:
>Hi,
>I am designing lustre file system that will be serving two separate
>clusters. One of the clusters is old and uses Ethernet data network.
>Second of the clusters is new and uses QDR IB data network. I would
>l
On Wed, Aug 26, 2009 at 06:52:24PM -0700, Abe Ingersoll wrote:
>..
>kiblnd_tx_complete()) Tx -> 10.168.22@o2ib cookie 0xc8dd6 sending 1
>waiting 1: failed 12
12 == IB_WC_RETRY_EXC_ERR, which usually indicates faulty links in the
network or some other application (like a MPI app
On Mon, Aug 17, 2009 at 12:23:35PM -0400, Charles A. Taylor wrote:
> FWIW, I posted this to ofa-general a little earlier. Anyone else
> seeing this? Suggestions? I think this is an OFED 1.4.1 problem
> but they may point the finger at you guys. :)
>
> We've tried limiting OST threads to n
On Mon, Aug 10, 2009 at 03:39:52PM +0200, Wolfgang Stief wrote:
> Hi out there!
>
> Before I start installing and fiddling around: Are there any reasons
> AGAINST setting up a Lustre playground in a VirtualBox environment? I
> just want to play around w/ recovery and debugging situations and
> upg
On Mon, Aug 10, 2009 at 03:56:13PM +0800, Lee Amy wrote:
> ..
> It seems this method cannot solve my problem. My NID is
> 10.0.38@tcp, and furthermore when I add the item
>
> options lnet network=tcp0(eth1)
>
> I still encountered the same problem and after this failure I change
> this it
On Fri, Jul 31, 2009 at 10:52:46AM -0600, Daniel Kulinski wrote:
>Unmounting lustre when our heartbeat software was misconfigured (IPMI
>password changed).
>
>
>tx1oss3-clusternet kernel: LustreError:
>19350:0:(quota_context.c:1369:lqs_exit())
>ASSERTION(atomic_read(&q->lqs_re
On Tue, Jul 28, 2009 at 02:24:12PM -0600, Daniel Kulinski wrote:
>I have read the very brief section on changing NIDs in the Lustre
>Manual.
In the attachments of bug 18231 you may find more information on
changing server NIDs:
https://bugzilla.lustre.org/show_bug.cgi?id=18231
I'm not sur
Have you tried irc.lustre.org instead of zone.lustre.org?
Isaac
On Tue, Jul 14, 2009 at 11:11:26AM -0700, Frank Leers wrote:
> Anybody in the know about the ETA of the lustre IRC server coming back
> up?
On Tue, Jul 07, 2009 at 11:44:39AM -0400, Isaac Huang wrote:
> ..
> > If I would attach the OSS with a single 10GbE link, could
> > a client then use the second link, when striping over targets
> > on same OSS?
>
> There's a rather complex way of static con
On Tue, Jul 07, 2009 at 03:44:32PM +0200, Ralf Utermann wrote:
> Dear list,
>
> we have setup of OSS and some clients with a dual Gigabit
> trunk (miimon=100 mode=802.3ad xmit_hash_policy=layer3+4).
If I understand it correctly, xmit_hash_policy=layer3+4 would not
allow a single TCP connection to
On Wed, Jul 01, 2009 at 02:07:33AM -0400, Isaac Huang wrote:
> ..
> >> For your current concern of setting up different SLs, I'd believe that
> >> it could be achieved via target GUIDs as mentioned in my previous reply.
> >
> > Unfortunately, configuring I
On Fri, Jun 26, 2009 at 01:42:53PM +0200, Sébastien Buisson wrote:
>
> Isaac Huang wrote:
>> On Wed, Jun 24, 2009 at 09:46:19AM +0200, Sébastien Buisson wrote:
>>> ..
>>> The peer's port information could be stored in the kib_peer_t
>>> struct
On Wed, Jun 24, 2009 at 09:46:19AM +0200, Sébastien Buisson wrote:
> ..
> The peer's port information could be stored in the kib_peer_t structure.
> That way, it would be possible to make clients connect to servers which
> listen on different ports.
> What do you think?
At this point it ca
On Mon, Jun 22, 2009 at 04:49:03PM +0200, Sébastien Buisson wrote:
> ..
> Let's consider we have two sets of OSSes, each set serving a different
> Lustre file system (i.e. all the OSTs of an OSS are part of the same
> Lustre file system). The same Lustre clients have access to both
> filesys
On Fri, Jun 19, 2009 at 08:43:11AM -0400, Michael Di Domenico wrote:
> > ..
> > Have you changed server NIDs without updating configuration logs with
> > --writeconf?
>
> By accident the lnet configs came up with the 192.168.0.x config
> because a modprobe setting was wrong. However, i took t
On Thu, Jun 18, 2009 at 09:51:33PM -0400, Michael Di Domenico wrote:
> ..
> > But the connection was rejected because the server didn't have
> > 192.168.0@tcp as one of its NIDs.
> >
> > What was your mount command line? What does 'lctl list_nids' say on
> > the nodes?
>
> list_nids show t
On Thu, Jun 18, 2009 at 09:11:50PM -0400, Michael Di Domenico wrote:
> I cannot figure out what exactly has happened here and how to recover from it.
>
> Jun 18 21:02:52 node0-eth1 kernel: LustreError:
> 2722:0:(socklnd_cb.c:2156:ksocknal_recv_hello()) Error -104 reading
> HELLO from 192.168.0.248
On Tue, Jun 16, 2009 at 12:57:27PM +0200, Tom Woezel wrote:
>..
>Jun 16 04:33:38 sososd1 kernel: BUG: soft lockup - CPU#2 stuck for 10s!
>..
>Jun 16 04:33:38 sososd1 kernel: Call Trace:
>Jun 16 04:33:38 sososd1 kernel:[]
>:bonding:ad_rx_machine+0x20/0x502
>Ju
On Thu, Jun 11, 2009 at 10:51:01PM -0400, Erik Froese wrote:
>OK here's where I am now.
>
>The public client can ping the routers public address but not the
>private address.
>
>[r...@routed-client lnet]$ cat /etc/modprobe.conf
>..
>options lnet accept=all
This would
On Tue, Jun 09, 2009 at 02:36:37PM +0200, Arne Wiebalck wrote:
> Dear all,
>
> I set up an 2.0-alpha2 system and planned to populate it with
> 100 million files. While populating it however, the MDS ran
> out of memory, the OOM kicked in, killed some processes, and
> all ended in a kernel panic.
>
On Thu, Jun 04, 2009 at 01:59:48PM -0400, Erik Froese wrote:
>Thanks Andreas and Natalie,
>
>I've made the changes you suggested (setting tcp1 as the external
>network) and I'm able to lctl ping the 128.122.x.y address but I still
>cannot ping the private address for the MDS.
Plea
On Wed, Jun 03, 2009 at 05:45:10PM -0400, Erik Froese wrote:
>..
>I don't see it sending any traffic to the router with tcpdump running
>on the router.
Alternatively, you may run 'routerstat 1' on the router to see how
much data is being forwarded per second.
Isaac
On Sat, May 23, 2009 at 09:18:43PM +0400, Alexey Lyashkov wrote:
> Hi Michael,
>
>
> On Fri, 2009-05-22 at 16:38 -0400, Michael D. Seymour wrote:
> > Hi all,
> >
> > One client running CentOS 5.2 re-exports the Lustre filesystem via NFS on a
> > different network.
> >
> > We get the following
On Tue, May 19, 2009 at 05:55:21PM +0200, Sébastien Buisson wrote:
> Hi,
>
> We took a slightly different approach to deal with IB QoS in Lustre.
>
> We decided to assign a specific service-id to Lustre: in ofa-kernel we
> added a new value in the rdma_port_space enum, that we called
> RDMA_PS
On Mon, May 18, 2009 at 12:04:37PM +0200, Daniel Kobras wrote:
> Hi!
>
> Does anyone know how to use QoS with Lustre's o2ib LND? The Voltaire IB
> LND allowed to #define a service level, but I couldn't find a similar
> facility in o2ib. Is there a different way to apply QoS rules?
The o2iblnd SL
On Thu, May 07, 2009 at 03:02:49PM -0700, Klaus Steden wrote:
> ..
> I didn't even touch Lustre bonding, because as you both remark, it's a
> little convoluted. I spent a lot of time experimenting with Lustre over
> 802.3ad (LACP) aggregated links using the Linux bonding driver, and my OSS
> no
On Thu, May 07, 2009 at 02:50:13PM +0200, Michael Ruepp wrote:
> Hi there,
> ..
> I give every NID a IP in the same subnet, eg: 10.111.20.35-38 - oss0
> and 10.111.20.39-42 oss1
>
> Do I have to make modprobe.conf.local look like this to force lustre
> to use all four interfaces parallel:
On Mon, Apr 27, 2009 at 12:21:41PM -0600, Nathan Dauchy wrote:
> Greetings,
>
> Does Lustre's o2ib LND take advantage of Infiniband's LID Mask Count
> (LMC) capability? Might it be included in the future? I'm looking for
> something similar to the "MV2_USE_HSAM=1" option for Hot-Spot Avoidance
>
On Fri, Apr 24, 2009 at 09:38:13AM +1000, Andrew Brooker wrote:
> I'm having some difficulty with a slightly more complicated multihomed TCP
> based LNET.
> Here is what I would like to achieve, have a single MGS/MDT server that
> lives on two physically separate IP networks. Be able to add OSTs fr
On Wed, Mar 25, 2009 at 04:47:21PM -0700, Adam Gandelman wrote:
> Hi list-
> ..
> On all nodes: Linux 2.6.18-92.1.10.el5_lustre.1.6.6smp #1 SMP Tue Aug 26
> 12:05:09 EDT 2008 i686 i686 i386 GNU/Linux
>
> BUG: soft lockup - CPU#0 stuck for 10s [socknal_cd00:2785]
It smells to me like an after
On Thu, Mar 12, 2009 at 03:29:40PM +, Gerd wrote:
> Hi,
>
> We have a 1.6.6 installation using InfiniBand attached DDN OST storage
> and OSS'es connected to the network with 10GE adapters. When running
> iozone with ~40 1GE attached clients we see the following on the clients:
> ..
> And
On Tue, Feb 24, 2009 at 09:38:42PM -0600, Hendelman, Rob wrote:
> ..
> I ended up with lots of problems and did end up hitting a few lbug's,
> specifically:
>
> LustreError: 11283:0: (tracefile.c:431:libcfs_assertion_failed()) LBUG
> LustreError: 8095:0: (tracefile.c:431:libcfs_assertion_fai
You might find this interesting:
http://www.cse.ohio-state.edu/~panda/temp/ib_10ge_advanced.pdf
Isaac
On Wed, Feb 11, 2009 at 2:08 PM, Jeffrey Bennett wrote:
> Hi,
>
> Has anybody done any performance comparison between Lustre with 10GbE and
> Lustre with Infiniband 4X SDR? I wonder if they per
On Thu, Feb 12, 2009 at 08:26:09AM -0500, Scott Atchley wrote:
>> ..
>> One exception is SOCKLND on Chelsio's T3, quote:
>>
>> "The T3 ASIC uses the mechanism of Direct Data Placement (DDP) that
>> provides a flexible zero copy on receive capability for regular TCP
>> connections, requiring no
On Wed, Feb 11, 2009 at 04:35:47PM -0500, Scott Atchley wrote:
> ..
> SOCKLND is limited by a copy on the receive side. When a client
> writes, the server has to copy the data out. When a client reads, it
> ..
One exception is SOCKLND on Chelsio's T3, quote:
"The T3 ASIC uses the mech
On Wed, Feb 11, 2009 at 06:11:30PM -0500, Charles Taylor wrote:
> ..
> Just ran a quick IMB (formerly Pallas) between a couple of our SDR
> nodes and got 860 MBytes/sec (ping-pong, 4MB). So I don't think
> there is anything inherent in SDR IB that limits you to 750 MBytes/
> sec. Howev
On Mon, Feb 09, 2009 at 04:52:20PM +0100, Götz Waschk wrote:
> Hello everyone,
> .
> My client has this in modprobe.conf:
> options lnet networks=o2ib,tcp
> I'm trying to mount the remote network with
> mount -t lustre 141.34.228...@tcp0:/atlas /scratch/lustre-1.6/atlas
> and the command just h
On Tue, Dec 23, 2008 at 06:45:09AM -0700, Denise Hummel wrote:
> Hi;
>
> Thanks. I have suspected the network, however have not been able to
> pinpoint the problem. I have looked at the ethernet and infiniband
> switches - found a few with IGMP turned on and some multicast issues.
> Those have b
On Fri, Dec 19, 2008 at 08:42:16AM -0700, Denise Hummel wrote:
> Hi;
>
> I have started getting numerous dump logs, timeouts and client
> evictions. Our environment:
> ..
> Dec 19 04:17:28 oss1 kernel: Lustre: 27065:0:(router.c:167:lnet_notify())
> Ignoring prediction from 172.16.100...@tcp
On Thu, Dec 18, 2008 at 10:30:42PM -0800, Arden Wiebe wrote:
> ..
> [r...@lustreone src]# uname -a
> Linux lustreone.linuxguru.ca 2.6.18-92.1.10.el5_lustre.1.6.6smp #1 SMP Tue
> Aug 26 12:16:17 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
>
> [r...@lustreone ~]# rpm -qa kernel\* | sort
> kernel-de
We're now doing research, and a design draft should be ready for
public review (on lustre-devel) at the beginning of next January.
Isaac
On Wed, Dec 17, 2008 at 05:51:12PM -0600, Mike Feuerstein wrote:
>Is support for network failover to an alternate IB port on the Lustre
>roadmap
>
>
On Mon, Dec 15, 2008 at 10:01:08AM +0800, Lu Wang wrote:
> Dear list,
> There are two Ethernet Cards on our login node, one outside
> connection(202.122.*.*), one for inside connection to other servers. The
> problem is Lustre sometimes confuse with configuration.
> [r...@lxslc09 ~]# netst
On Thu, Nov 13, 2008 at 04:18:02PM -0800, Joseph Farran wrote:
> ..
> # lctl list_nids
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]
>
> How can I get Lustre to use both ib0 and bond0 (eth0 / eth1) for the
> data network? Currently it only uses InfiniBand (ib0) and not bond0.
You may find this us
On Mon, Oct 20, 2008 at 01:20:46PM +0200, Danny Sternkopf wrote:
> >> Beside of TCP it is only possible to use multiple interfaces on the same
> >> node with o2ib, right? With ko2iblnd one can setup several Lustre
> >> networks for each IB interface. In fact you must setup several Lustre
> >> netwo
On Mon, Oct 13, 2008 at 05:29:14PM +0200, Danny Sternkopf wrote:
> ..
> Interesting is how to use multiple interfaces on the same server in
> Lustre/LNet. My understanding is that TCP(ksocklnd) can manage multiple
> physical interfaces as one LNet interface with one unique NID. Is that
> still
On Sun, Oct 12, 2008 at 10:15:01AM -0400, Brock Palen wrote:
> ..
> Currently we don't put any lustre modules in modprobe.conf, lustre
> loads the correct modules when mounting the filesystem. We do this
> to keep our loads simple as we have several.
When nothing has been specified, LNet
On Tue, Oct 07, 2008 at 11:00:20PM -0600, Andreas Dilger wrote:
> On Oct 07, 2008 22:58 -0400, Mag Gam wrote:
> > My intention was I wanted to see if my lustre connection is being
> > routed thru other interfaces. I have 4 interfaces on my server: eth0
> > thru eth4. eth0 is used for Lustre but it
On Wed, Sep 24, 2008 at 05:22:55PM -0600, Nathan Dauchy wrote:
> Can anyone direct me to documentation to decipher these messages?
> What does "server_bulk_callback" do, and does "status -103" indicate a
> severe problem for event types 2 and 4?
server_bulk_callback signals the completion of bulk
On Tue, Sep 09, 2008 at 01:55:46PM +0900, Alex Lee wrote:
> I've been seeing something that looks like IB timeout errors lately after
> upgrading to 1.6.5.1 using the supplied ofed kernel drivers.
> ..
> Sep 9 00:25:31 lustre-oss-4-1 kernel: LustreError:
> 13228:0:(o2iblnd_cb.c:2874:kiblnd_chec
http://www.mail-archive.com/[EMAIL PROTECTED]/msg00491.html
Isaac
On Fri, Aug 15, 2008 at 07:16:23AM -0400, Mag Gam wrote:
> I am doing a case study at my university and I am trying to analyze
> packets for LNET. I want to compare this with other Network based
> filesystems, such as NFS and SMB.
On Tue, Jun 03, 2008 at 08:32:09AM -0400, Murray Smigel wrote:
>Some additional information on the problem. I tried disconnecting the
>ethernet connection to
>the server machine (192.168.1.94) and tried running a disk test on the
>client (192.168.1.156 via ethernet), writing to
>
Since both the configuration and the IB link bandwidth looked fine,
I'd suggest measuring LNet throughput with lnet selftest:
1. On both client and server: modprobe lnet_selftest
2. On the client:
export LST_SESSION=$$
lst new_session --timeo 10 test
lst add_group s [EMAIL PROTECTED]
lst add_gr