Re: [Lustre-discuss] OST redundancy between nodes?

2009-07-13 Thread Carlos Santana
Comments in-line.
-
CS.


On Fri, Jun 26, 2009 at 1:09 PM, Brian J. Murrell wrote:
> On Fri, 2009-06-26 at 11:51 -0600, Kevin Van Maren wrote:
>> If an OST "fails", meaning that the underlying HW has failed (or the
>> connection to the storage has failed -- one reason to use multipath IO),
>> then Lustre will return IO errors to the application (although there is
>> an RFE to not do that).
>
> This is not entirely true.  It is only true when an OST is configured as
> "failout".  When an OST is configured as failover however (which is the
> typical case), the application just blocks until the OST can be put back
> into service again on any of the defined failover nodes for that OST and
> the client can reconnect.  At that time, pending operations are resumed
> and the application continues.
>

The application does not block for all commands. For example, lfs df
works, and so does new file creation (if you have another OST
running). However, querying disk space with df, or running ls, will
fail. And this fails even after deactivating the OST on the MDS.


>> Normally what happens is the OSS _node_ fails,
>> and the other node mounts the OST (typically done by using
>> Linux-HA/Heartbeat).
>
> Right.  And no applications see any errors while this happens.
>
> And it is worth noting that defining an OST for failover does not
> require that more than one OSS be defined for it.  You can provide
> "failover service" (i.e. no EIOs to clients) using a single OSS.  If it
> dies, then clients just block until it can be repaired.
>
> b.
>
>
> ___
> Lustre-discuss mailing list
> Lustre-discuss@lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] file system shrinking and OST failure

2009-07-13 Thread Carlos Santana
All right, so we can shrink the file system. The manual has useful
info about OST failure/removal. I have a few related questions about
it.

The manual has a note in the failover chapter (8-4) about stopping a
client process that waits indefinitely, saying "the OST is explicitly
marked "inactive" on the clients: lctl --device  deactivate". But a
note in chapter 4-18 says "Do not deactivate the OST on the clients.
Do so causes errors (EIOs), and the copy out to fail.". This is a bit
confusing. So what should we do when an OST fails? And when should we
deactivate the OST (or, to be precise, the OSC?) on the client?

Could you please elaborate on configuring failover while making a
new filesystem? The mkfs.lustre command does not have a --failover
switch, but rather a --failnode switch. So do we just need to specify
'--failnode='  or anything else?
What is the correct method?

And do we need to configure this (spare) OST for the file system and
have it active/mounted while running the above mkfs.lustre command?

-
CS.


On Mon, Jul 13, 2009 at 12:56 PM, Brian J. Murrell wrote:
> On Mon, 2009-07-13 at 12:51 -0500, Carlos Santana wrote:
>> Does lustre support shrinking of file system size - online or offline?
>> I read online is not supported, but I couldn't find any info for the
>> offline shrinking. My guess is that it is not supported. Please
>> correct me if I am wrong.
>
> You can shrink the filesystem by simply removing an OST.  Of course, if
> there are objects on that OST, you need to move them off first, or you
> will lose the files, (or parts of files in the case of striped files)
> those objects are members of.
>
> All of the manual, this list and bugzilla have discussed how to move
> files off an OST in pretty great detail.  Please check them for details
> on how.
>
> b.
>
>


[Lustre-discuss] file system shrinking and OST failure

2009-07-13 Thread Carlos Santana
Does lustre support shrinking of file system size - online or offline?
I read online is not supported, but I couldn't find any info for the
offline shrinking. My guess is that it is not supported. Please
correct me if I am wrong.

Can OST failure/removal be related to shrinking?

Thanks,
CS.


[Lustre-discuss] failover software - heartbeat

2009-07-13 Thread Carlos Santana
Howdy,

The Lustre manual recommends Heartbeat for handling failover.
Pacemaker is the successor to Heartbeat version 2. So what's recommended:
should we be using Pacemaker or stick to Heartbeat?

-
CS.


Re: [Lustre-discuss] MGS - one per site

2009-06-26 Thread Carlos Santana
Lustre 1.8 manual PDF -> 4.4.Operational Scenarios -> 'IP Network,
Single MDS, Single OST, No Failover' (page# 106)
There seems to be the same mistake in the 1.6 PDF as well, but not in the HTML.

Thanks for clarifying.

~
CS.

On Fri, Jun 26, 2009 at 1:23 PM, Brian J. Murrell wrote:
> On Fri, 2009-06-26 at 13:18 -0500, Carlos Santana wrote:
>>
>> I know we need to specify --mgsnode, however I saw target type '--mgs'
>> being specified on MGS and OSS.
>
> This would be an error.  Where did you see it?  If you saw it in some
> text from Sun, you can file a bug at our bugzilla.
>
>> May be it was for failover.
>
> No.  Specifying --mgs does nothing for failover on an OSS.
>
> b.
>
>


Re: [Lustre-discuss] OST redundancy between nodes?

2009-06-26 Thread Carlos Santana
On Fri, Jun 26, 2009 at 1:21 PM, Brian J. Murrell wrote:
> On Fri, 2009-06-26 at 13:15 -0500, Carlos Santana wrote:
>>
>> Yeah, this is what I am curious abt - OST/disk/storage-device failure.
>
> If the media (i.e. physical disk) that is an OST fails, then there is
> nothing Lustre can do to recover it.  This is why we strongly suggest
> OSTs be some form of RAID.  Lustre absolutely assumes that the storage
> is reliable and adds no additional redundancy to/for OSTs or the MDT.
>
Yeah, this was answered by Kevin in the beginning of this thread. My
question was what message/error would be given to the client.

Also, I had not realized that the terms OSS failure and OST failure
were being used interchangeably.

The term 'failover' seems appropriate when talking about servers, not
targets.

> The bottom line -- your data is only as safe as the disks (virtual or
> physical) that you give to Lustre.
>
> b.

~
Thanks,
CS.


Re: [Lustre-discuss] MGS - one per site

2009-06-26 Thread Carlos Santana
On Fri, Jun 26, 2009 at 10:57 AM, Brian J. Murrell wrote:
> On Fri, 2009-06-26 at 10:52 -0500, Carlos Santana wrote:
>> Can a lustre file system have more than one MGS?
>
> No.  It can have multiple failover paths to a single MGS, but max. 1 MGS
> per filesystem.
>
>> Isn't it only one per
>> site?
>
> It can be, and that's how it was designed.  IOW, a single MGS can serve
> multiple filesystems.
>
>> I saw some examples where target type mgs was mentioned during
>> mkfs.lustre for MDS and OSS nodes.
>
> Yes.  You have to tell the MDSes and OSSes (and clients, via their mount
> target) where the MGS(es, for failover) are.
I know we need to specify --mgsnode, however I saw target type '--mgs'
being specified on MGS and OSS. Maybe it was for failover.

>
> b.
>
>


Re: [Lustre-discuss] OST redundancy between nodes?

2009-06-26 Thread Carlos Santana
On Fri, Jun 26, 2009 at 12:51 PM, Kevin Van Maren wrote:
> OSS is the server.  It normally provides one or more OSTs.
>
> OST failover is done by configuring multiple OSS nodes to be able to serve
> the same OST.  Only ONE OSS node may provide the OST at a time.
>
I understand that an OST can't be shared by two or more active OSSs at a
time. But we can/should configure OSSs for failover mode. In my
interpretation, an OST failure was a disk/storage failure. So the failover
you are referring to was an OSS failover in my understanding (i.e.,
switch to another failover OSS node if a particular OSS fails).

> Failover is accomplished by the clients attempting to connect to each OSS
> node configured to serve the OST, until one of them responds with it active.
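
The connection behaviour described above can be pictured with a small sketch. This is only a toy model of the idea, not Lustre client code: the node names, the `serves_ost` probe and the `max_rounds` bound are all hypothetical, and a real client keeps retrying rather than giving up.

```python
from typing import Callable, Optional, Sequence


def find_active_oss(
    oss_nodes: Sequence[str],
    serves_ost: Callable[[str], bool],
    max_rounds: int = 3,
) -> Optional[str]:
    """Return the first OSS that reports the OST active, or None.

    oss_nodes  -- hypothetical list of nodes configured to serve one OST
    serves_ost -- hypothetical probe; True if that node has the OST mounted
    max_rounds -- bound used by this sketch only
    """
    for _ in range(max_rounds):
        for node in oss_nodes:
            if serves_ost(node):
                return node
    return None


# Example: oss1 has failed and oss2 has mounted the OST in its place.
mounted_on = {"oss1": False, "oss2": True}
active = find_active_oss(["oss1", "oss2"], lambda node: mounted_on[node])
print(active)
```

With a single configured OSS the list simply has one entry, which matches the case where clients wait for that one node to come back.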
>
>
> An OST can be moved back-and-forth between OSS nodes by umount/mount
> commands (assuming both servers can access the same disk!)
>
> If an OST "fails", meaning that the underlying HW has failed (or the
> connection to the storage has failed -- one reason to use multipath IO),
> then Lustre will return IO errors to the application (although there is an
> RFE to not do that).  Normally what happens is the OSS _node_ fails, and the
> other node mounts the OST (typically done by using Linux-HA/Heartbeat).
>

Yeah, this is what I am curious about: OST/disk/storage-device failure.
It might be nice to have something on the wiki regarding the server and
the target as separate entities or on the same machine. I have gone
through the FAQ entry, but it would be great if we could elaborate on it
further.

>
> MDS/MDT failover/configuration is similar.
>
> Kevin
>
>
>
> Carlos Santana wrote:
>>
>> Sorry, but may be I am confused between OSS and OST.
>>
>> On Fri, Jun 26, 2009 at 11:24 AM, Brian J. Murrell
>> wrote:
>>
>>>
>>> On Fri, 2009-06-26 at 10:56 -0500, Carlos Santana wrote:
>>>
>>>>
>>>> I was wondering what will happen during OST failure
>>>>  - if client is making some read/write operation
>>>>
>>>
>>> Assuming the OST is configured for failover, the client will retry
>>> anything that didn't get committed to disk before the OST failure.  It
>>> will try with all available failover targets for the OST.
>>>
>>
>> Can OST(disk) be configured for failover like an OSS(server node)?
>>
>>
>>>>
>>>> - if client requests read/write after OST fails
>>>>
>>>
>>> Same as above.
>>>
>>>
>>>>
>>>> When I made OSS unavailable the client waited/got delayed response
>>>> till OSS connected back.
>>>>
>>>
>>> Right.  That's failover.
>>>
>>>
>>>>
>>>> I am not sure about OST failure though. Any
>>>> clues?
>>>>
>>>
>>> An OST fails if an OSS fails given that an OST is the disk in an OSS
>>> (which is the node).
>>>
>>
>> I thought an OST(disk) can fail without OSS(server) being failed.
>> And that's my question, what will happen in such scenario - while
>> client is in read/write operation and client requesting read/write
>> after the OST(disk) failure?
>>
>>
>>>
>>> b.
>>>
>>>
>
>


Re: [Lustre-discuss] OST redundancy between nodes?

2009-06-26 Thread Carlos Santana
Sorry, but maybe I am confused between OSS and OST.

On Fri, Jun 26, 2009 at 11:24 AM, Brian J. Murrell wrote:
> On Fri, 2009-06-26 at 10:56 -0500, Carlos Santana wrote:
>>
>> I was wondering what will happen during OST failure
>>  - if client is making some read/write operation
>
> Assuming the OST is configured for failover, the client will retry
> anything that didn't get committed to disk before the OST failure.  It
> will try with all available failover targets for the OST.

Can an OST (disk) be configured for failover like an OSS (server node)?

>
>> - if client requests read/write after OST fails
>
> Same as above.
>
>> When I made OSS unavailable the client waited/got delayed response
>> till OSS connected back.
>
> Right.  That's failover.
>
>> I am not sure about OST failure though. Any
>> clues?
>
> An OST fails if an OSS fails given that an OST is the disk in an OSS
> (which is the node).

I thought an OST (disk) can fail without the OSS (server) failing.
And that's my question: what will happen in such a scenario, both while
a client is in a read/write operation and when a client requests a
read/write after the OST (disk) failure?

>
> b.
>
>

~
CS.


Re: [Lustre-discuss] OST redundancy between nodes?

2009-06-26 Thread Carlos Santana
Thanks Brian.

I was wondering what will happen during an OST failure
 - if a client is in the middle of a read/write operation
 - if a client requests a read/write after the OST fails

When I made the OSS unavailable, the client waited (got a delayed
response) until the OSS connected back. I am not sure about OST failure
though. Any clues?

-
CS.


On Thu, Jun 25, 2009 at 10:34 AM, Brian J. Murrell wrote:
> On Thu, 2009-06-25 at 10:21 -0500, Carlos Santana wrote:
>>
>> I am confused about this. Will the files in that OST be unavailable or
>> some of the files in that filesystem be unavailable?
>
> Both.  An OST contains objects.  For singly striped files (the default),
> a single object is the entire file (data).  So losing an OST means
> losing the object which means losing the file (contents).
>
>> My impression is that lustre would stripe file data across many OSTs
>> in terms of objects.
>
> It *may*.  By default it does not.
>
>> So wouldn't failure of one OST will potentially
>> corrupt the files which have stripes/objects stored over that OST?
>
> Yes.  This is the other side of the "both" I mentioned above.
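
To make the "both" concrete, here is a toy model of which files are hit when one OST is lost. The file names and the layout table are made up for illustration; this is not how Lustre stores or reports layouts.

```python
from typing import Dict, List, Set


def files_affected(layout: Dict[str, List[int]], failed_ost: int) -> Set[str]:
    """Return the files that have at least one object on the failed OST."""
    return {name for name, osts in layout.items() if failed_ost in osts}


# Hypothetical layout: file name -> indices of the OSTs holding its objects.
layout = {
    "a.dat": [0],        # singly striped (the default): one object, one OST
    "b.dat": [1],        # singly striped, but on a different OST
    "c.dat": [0, 1, 2],  # striped over three OSTs
}

lost = sorted(files_affected(layout, failed_ost=0))
print(lost)
```

In this model, losing OST 0 takes out all of a.dat and part of c.dat, while b.dat is untouched.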
>
> b.
>
>


[Lustre-discuss] MGS - one per site

2009-06-26 Thread Carlos Santana
Can a lustre file system have more than one MGS? Isn't it only one per
site? I saw some examples where target type mgs was mentioned during
mkfs.lustre for MDS and OSS nodes. Is it correct and when is it used?
Any inputs?
~
Thanks,
CS.


Re: [Lustre-discuss] failover mode

2009-06-25 Thread Carlos Santana
On Thu, Jun 25, 2009 at 9:45 AM, Amita Bhatkhande wrote:
> How do I handle MDT and OSS failures (not MDS and OST)? Is there any
> failover mode for them?
>
I started my journey with Lustre a week ago, so I am not sure. But OSS
failures can be handled using failover mode. The MDT can be (should be)
backed up regularly.

-
CS.

> Also, is it necessary to have each OST node to be a RAID device?
>
> Thanks,
> Amita.
>


Re: [Lustre-discuss] OST redundancy between nodes?

2009-06-25 Thread Carlos Santana
On Fri, Jun 19, 2009 at 1:15 PM, Kevin Van Maren wrote:
> Gary Gogick wrote:
>> Heya all,
>>
>> I'm investigating potential solutions for a storage deployment.
>> Lustre piqued my interest due to ease of scalability and awesome
>> aggregate throughput potential.
>>
>> Wondering if there's any provision in Lustre for handling catastrophic
>> loss of a node containing an OST; eg. replication/mirroring of OSTs to
>> other nodes?

I am confused about this. Will the files in that OST be unavailable or
some of the files in that filesystem be unavailable?
My impression is that Lustre stripes file data across many OSTs
in terms of objects. So wouldn't the failure of one OST potentially
corrupt the files which have stripes/objects stored on that OST?

Please correct me if I am wrong.

-
CS.

>>
>> I'm gathering from the 1.8.0 documentation that there's no protection
>> of this sort for data other than underlying RAID configs on any
>> individual node, at least not without attempting to do some
> >> interesting stuff with DRBD.  Just started looking at Lustre over the
>> past day though, so I'd totally appreciate an authoritative answer in
>> case I'm misinterpreting the documentation. :)
>
> Correct.
>
> Lustre failover can be used to support catastrophic failure of a _node_,
> but not the _storage_.  If your configuration makes LUNs available to
> two nodes, it is possible to configure Lustre to operate across the
> failure of a server.
>
> If your LUN fails catastrophically, all the data on that lun is gone.
> It is possible to bring Lustre up without it, but none of the files on
> that OST would be available.  If you are concerned about this case, then
> backups are your friend.
>
> While DRBD could be used to make a LUN "available" to two nodes, it will
> have a significant impact on performance, and (AFAIK) does not do
> synchronous replication, so an fsck would be required prior to mounting
> the OST on the second node, and there would be some data loss.
>
> Kevin
>


[Lustre-discuss] RAID stripe width

2009-06-24 Thread Carlos Santana
Hi,

What is stripe width? The Lustre manual says: stripe width =
chunk size * (disks - parity disks) <= 1 MB. I am completely
new to this field and I was looking for some RAID documentation
online. Some people have mentioned that stripe width is the number of
parallel stripes that can be written to or read from simultaneously,
which equals the number of disks in an array. Something is wrong with my
understanding or interpretation of stripe width. Any elaboration on
this and also on the MTTF and repair time calculation given in the manual?
Any good guide on RAID?
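
For what it's worth, here is how I currently read that rule, as a small worked example. It assumes the manual means a size (chunk size times the number of data disks, kept at or below 1 MB) rather than a disk count; the function name and the disk counts below are my own illustrative choices, not values from the manual.

```python
def stripe_width_kib(chunk_kib: int, disks: int, parity_disks: int) -> int:
    """Full-stripe size in KiB: chunk size times the number of data disks."""
    data_disks = disks - parity_disks
    if data_disks <= 0:
        raise ValueError("need at least one data disk")
    return chunk_kib * data_disks


# Illustrative numbers only: RAID-6 over 10 disks (2 parity), 128 KiB chunks.
width = stripe_width_kib(chunk_kib=128, disks=10, parity_disks=2)
fits = width <= 1024  # 1 MB expressed in KiB
print(width, fits)
```

Under this reading, the same array with 256 KiB chunks would give a 2048 KiB stripe and exceed the limit.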

Thanks,
CS.


Re: [Lustre-discuss] MDS - RAID0 or RAID1

2009-06-24 Thread Carlos Santana
On Tue, Jun 23, 2009 at 4:36 PM, Andreas Dilger wrote:
> On Jun 23, 2009  15:14 -0500, Carlos Santana wrote:
>> What RAID level should be used for MDS configuration? The manual seems
>> to suggest RAID1, however, the FAQ suggests RAID0 level.
>> http://manual.lustre.org/manual/LustreManual16_HTML/RAID.html#50401396_pgfId-1290495
>> http://wiki.lustre.org/index.php/Lustre_FAQ#What_is_the_typical_MDS_node_configuration.3F
>
> The MDT should always use RAID-1.  If you need more space than will fit
> on one disk, or you need more IOPS than can be handled by a single disk,
> then create sufficient RAID-1 pairs for your needs and then stripe them
> with RAID-0.  Having RAID-1 in pairs at the lower level minimizes the
> chance that multiple disk failures will cause data loss.
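
The point about mirroring at the lower level can be checked by brute force. The sketch below is an illustration added here, not part of Andreas's reply: it counts, over all possible two-disk failures, how many lose data when RAID-1 pairs are striped with RAID-0, and contrasts that with the opposite nesting (two RAID-0 stripes mirrored). The disk numbering is an arbitrary assumption.

```python
from fractions import Fraction
from itertools import combinations


def fatal_fraction_raid10(pairs: int) -> Fraction:
    """Fraction of two-disk failures that lose data with striped RAID-1 pairs."""
    # Disks 2k and 2k+1 form mirror pair k; data is lost only if both fail.
    combos = list(combinations(range(2 * pairs), 2))
    fatal = sum(1 for a, b in combos if a // 2 == b // 2)
    return Fraction(fatal, len(combos))


def fatal_fraction_raid01(pairs: int) -> Fraction:
    """Same count for two RAID-0 stripes mirrored with RAID-1 (for contrast)."""
    # Disks 0..pairs-1 are stripe A, the rest stripe B; one failed disk in
    # each stripe takes both copies offline.
    combos = list(combinations(range(2 * pairs), 2))
    fatal = sum(1 for a, b in combos if (a < pairs) != (b < pairs))
    return Fraction(fatal, len(combos))


r10 = fatal_fraction_raid10(4)
r01 = fatal_fraction_raid01(4)
print(r10, r01)
```

With eight disks, only the failures that hit both halves of the same mirror are fatal in the first layout, which is a much smaller share than in the second.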
>
> I've clarified the FAQ in this regard.
>
>> Also, is it possible to add more OSTs as the requirement grows?
>
> Yes, adding new OSTs to an existing filesystem is a fairly common
> procedure.  Note if you are planning on expanding your filesystem
> in this way it makes sense to configure the MDT to have enough
> space/inodes for the OSTs that will be added in the future.
>
>> If yes, can additional resources be made part of RAID or can we
>> turn existing non-RAID system into a RAID system? Any insights?
>
> I'm not sure I understand this question.  OSTs should always use
> RAID to avoid data loss, though RAID-5 or RAID-6 is OK for OSTs.

My initial test setup does not have any RAID. I was planning to
explore a RAID setup later, so I was curious about it.
Thanks for the info.

-
CS.

>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
>
>


[Lustre-discuss] recovery status errors

2009-06-22 Thread Carlos Santana
Hello,

The Lustre server is giving the following errors related to recovery mode.
What could be the cause and solution for this? I do remember rebooting my
server without unmounting the OSS and MDS nodes, though.

Logs:

Jun 20 01:53:14 localhost kernel: Lustre:
5771:0:(mds_fs.c:674:mds_init_server_data()) RECOVERY: service
lustre-MDT, 1 recoverable clients, 0 delayed clients, last_transno
34359738368
Jun 20 01:53:14 localhost kernel: Lustre: MDT lustre-MDT now
serving lustre-MDT_UUID
(lustre-MDT/e6123a33-d80d-bf1b-490b-49893680fa58), but will be in
recovery for at least 5:00, or until 1 client reconnect. During this
time new clients will not be allowed to connect. Recovery progress can
be monitored by watching
/proc/fs/lustre/mds/lustre-MDT/recovery_status.
Jun 20 01:53:14 localhost kernel: Lustre:
5771:0:(lproc_mds.c:271:lprocfs_wr_group_upcall()) lustre-MDT:
group upcall set to /usr/sbin/l_getgroups
Jun 20 01:53:14 localhost kernel: Lustre: lustre-MDT.mdt: set
parameter group_upcall=/usr/sbin/l_getgroups
Jun 20 01:53:14 localhost kernel: Lustre: Server lustre-MDT on
device /dev/loop5 has started
Jun 20 01:53:19 localhost kernel: Lustre: Request
x18446744071689995159 sent from lustre-OST-osc to NID 0...@lo 5s ago
has timed out (limit 5s).
Jun 20 01:53:28 localhost kernel: Lustre: lustre-MDT: temporarily
refusing client connection from 10.0.0...@tcp
Jun 20 01:53:28 localhost kernel: LustreError:
5764:0:(ldlm_lib.c:1826:target_send_reply_msg()) @@@ processing error
(-11)  r...@c8bc5400 x-1300233114/t0 o38->@:0/0 lens 368/0 e 0 to
0 dl 1245480908 ref 1 fl Interpret:/0/0 rc -11/0

-

Lustre: 2057:0:(filter.c:999:filter_init_server_data()) RECOVERY:
service lustre-OST, 1 recoverable clients, 0 delayed clients,
last_rcvd 47244640256
Lustre: OST lustre-OST now serving dev
(lustre-OST/072355c9-f254-9af8-4c05-bce872c287bf), but will be in
recovery for at least 5:00, or until 1 client reconnect. During this
time new clients will not be allowed to connect. Recovery progress can
be monitored by watching
/proc/fs/lustre/obdfilter/lustre-OST/recovery_status.
Lustre: Server lustre-OST on device /dev/loop1 has started
Lustre: 1971:0:(import.c:508:import_select_connection())
lustre-OST-osc: tried all connections, increasing latency to 5s
Lustre: 2048:0:(ldlm_lib.c:1333:check_and_start_recovery_timer())
lustre-OST: starting recovery timer
LustreError: 2048:0:(ldlm_lib.c:884:target_handle_connect())
lustre-OST: denying connection for new client 0...@lo
(lustre-mdtlov_UUID): 1 clients in recovery for 300s
LustreError: 2048:0:(ldlm_lib.c:1826:target_send_reply_msg()) @@@
processing error (-16)  r...@ce5c5e00 x464519181/t0 o8->@:0/0
lens 368/264 e 0 to 0 dl 1245622846 ref 1 fl Interpret:/0/0 rc -16/0
LustreError: 11-0: an error occurred while communicating with 0...@lo.
The ost_connect operation failed with -16

-

Lustre: 2057:0:(filter.c:999:filter_init_server_data()) RECOVERY:
service lustre-OST, 1 recoverable clients, 0 delayed clients,
last_rcvd 47244640256
Lustre: OST lustre-OST now serving dev
(lustre-OST/072355c9-f254-9af8-4c05-bce872c287bf), but will be in
recovery for at least 5:00, or until 1 client reconnect. During this
time new clients will not be allowed to connect. Recovery progress can
be monitored by watching
/proc/fs/lustre/obdfilter/lustre-OST/recovery_status.
Lustre: Server lustre-OST on device /dev/loop1 has started
Lustre: 1971:0:(import.c:508:import_select_connection())
lustre-OST-osc: tried all connections, increasing latency to 5s
Lustre: 2048:0:(ldlm_lib.c:1333:check_and_start_recovery_timer())
lustre-OST: starting recovery timer
LustreError: 2048:0:(ldlm_lib.c:884:target_handle_connect())
lustre-OST: denying connection for new client 0...@lo
(lustre-mdtlov_UUID): 1 clients in recovery for 300s
LustreError: 2048:0:(ldlm_lib.c:1826:target_send_reply_msg()) @@@
processing error (-16)  r...@ce5c5e00 x464519181/t0 o8->@:0/0
lens 368/264 e 0 to 0 dl 1245622846 ref 1 fl Interpret:/0/0 rc -16/0
LustreError: 11-0: an error occurred while communicating with 0...@lo.
The ost_connect operation failed with -16
Lustre: 1971:0:(import.c:508:import_select_connection())
lustre-OST-osc: tried all connections, increasing latency to 10s
LustreError: 2049:0:(ldlm_lib.c:884:target_handle_connect())
lustre-OST: denying connection for new client 0...@lo
(lustre-mdtlov_UUID): 1 clients in recovery for 250s
LustreError: 2049:0:(ldlm_lib.c:1826:target_send_reply_msg()) @@@
processing error (-16)  r...@c9a0d600 x464519184/t0 o8->@:0/0
lens 368/264 e 0 to 0 dl 1245622896 ref 1 fl Interpret:/0/0 rc -16/0

-


Following are the log messages in detail:
http://www.heypasteit.com/clip/92X
http://www.heypasteit.com/clip/92Y
http://www.heypasteit.com/clip/92Z

Any clues?
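
As a small aid for reading the return codes in the logs above (-11 and -16), the sketch below maps a negative kernel-style code to its errno name using Python's standard errno table. This is only a decoding helper and a guess at how to read the messages, not a diagnosis; the function name is made up, and the numbering is the Linux one.

```python
import errno
import os


def describe_rc(rc: int) -> str:
    """Translate a negative kernel-style return code into its errno name."""
    code = abs(rc)
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


# The two codes that appear in the log excerpts.
for rc in (-11, -16):
    print(rc, describe_rc(rc))
```

On Linux, 11 is EAGAIN and 16 is EBUSY, which would fit the "temporarily refusing client connection" and "denying connection for new client" messages while recovery is in progress.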

Thanks,
CS.

Re: [Lustre-discuss] Lustre installation and configuration problems

2009-06-19 Thread Carlos Santana
Guys,

Thanks a lot for all the help.
I was able to build a patchless client from source. The basic verification
tests (unix commands) were successful.

I had an issue with the latest CentOS kernel, 2.6.18-128.el5, though. I
started with a minimal install (without gcc) and then installed gcc through
yum, which has a dependency on the kernel-headers package. By default CentOS
5.2 selects the package from the updates repo, so one may end up with
2.6.18-92.el5 for kernel and 2.6.18-128.el5 for kernel-headers. I also tried
building against the latest 2.6.18-128.el5 kernel, however it had an issue as
pointed out here:
http://lists.lustre.org/pipermail/lustre-discuss/2009-May/010560.html (bug
fixed: https://bugzilla.lustre.org/show_bug.cgi?id=19024 ).

Thank you everyone.
Excited to get started with Lustre.

-
CS.


On Wed, Jun 17, 2009 at 7:36 PM, Arden Wiebe  wrote:

>
> Carlos:
>
> This client of mine works. Matter of fact on all my clients it works.
>
> [r...@lustreone]# rpm -qa | grep -i lustre
> lustre-ldiskfs-3.0.8-2.6.18_92.1.17.el5_lustre.1.8.0smp
> lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> kernel-lustre-smp-2.6.18-92.1.17.el5_lustre.1.8.0
>
> Otherwise your output for the same command lists only 2 packages installed
> so you are missing some packages - those being the client packages if you
> don't want to use the patched kernel method of making a client as I have
> done above.  If you issue the rpm commands I mentioned in the very first
> response of this thread you will have a working client.
>
> Arden
>
> --- On Wed, 6/17/09, Carlos Santana  wrote:
>
> > From: Carlos Santana 
> > Subject: Re: [Lustre-discuss] Lustre installation and configuration
> problems
> > To: "Jerome, Ron" 
> > Cc: lustre-discuss@lists.lustre.org
> > Date: Wednesday, June 17, 2009, 5:10 PM
> > Folks,
> >
> > It been unsuccessful till now..
> >
> > I made a fresh CentOS 5.2 minimum install (2.6.18-92.el5).
> > Later, I
> > updated kernel to 2.6.18-92.1.17 version. Here is a output
> > from uname
> > and rpm query:
> >
> > [r...@localhost ~]# rpm -qa | grep lustre
> > lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> > lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> > [r...@localhost ~]# uname -a
> > Linux localhost.localdomain 2.6.18-92.1.17.el5 #1 SMP Tue
> > Nov 4
> > 13:45:01 EST 2008 i686 i686 i386 GNU/Linux
> >
> > Other details:
> > --- --- ---
> > [r...@localhost ~]# ls -l /lib/modules | grep 2.6
> > drwxr-xr-x 6 root root 4096 Jun 17 18:47
> > 2.6.18-92.1.17.el5
> > drwxr-xr-x 6 root root 4096 Jun 17 17:38 2.6.18-92.el5
> >
> >
> > [r...@localhost modules]# find . | grep lustre
> > ./2.6.18-92.1.17.el5/kernel/net/lustre
> > ./2.6.18-92.1.17.el5/kernel/net/lustre/libcfs.ko
> > ./2.6.18-92.1.17.el5/kernel/net/lustre/lnet.ko
> > ./2.6.18-92.1.17.el5/kernel/net/lustre/ksocklnd.ko
> > ./2.6.18-92.1.17.el5/kernel/net/lustre/ko2iblnd.ko
> > ./2.6.18-92.1.17.el5/kernel/net/lustre/lnet_selftest.ko
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre/osc.ko
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre/ptlrpc.ko
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre/obdecho.ko
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre/lvfs.ko
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre/mgc.ko
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre/llite_lloop.ko
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre/lov.ko
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre/mdc.ko
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre/lquota.ko
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre/lustre.ko
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre/obdclass.ko
> > --- --- ---
> >
> >
> > I am still having same problem. I seriously doubt, am I
> > missing anything?
> > I also tried a source install for 'patchless client',
> > however I have
> > been consistent in its results too.
> >
> > Are there any configuration steps needed after rpm (or
> > source)
> > installation? The one that I know of is restricting
> > interfaces in
> > modeprobe.conf, however I have tried it on-n-off with no
> > success.
> > Could anyone please suggest any debugging and tests for the
> > same? How
> > can I provide you more valuable output to help me? Any
> > insights?
> >
> > Also, I have a suggestion here. It might be good idea to
> > check for
> > 'uname -r' check in RPM installation to check for matching
> > kernel
> > version and if not suggest for source install.
> >
> > Thanks for the help. I really

Re: [Lustre-discuss] Lustre installation and configuration problems

2009-06-17 Thread Carlos Santana
Folks,

It has been unsuccessful till now.

I made a fresh CentOS 5.2 minimal install (2.6.18-92.el5). Later, I
updated the kernel to version 2.6.18-92.1.17. Here is the output from uname
and rpm query:

[r...@localhost ~]# rpm -qa | grep lustre
lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
[r...@localhost ~]# uname -a
Linux localhost.localdomain 2.6.18-92.1.17.el5 #1 SMP Tue Nov 4
13:45:01 EST 2008 i686 i686 i386 GNU/Linux

Other details:
--- --- ---
[r...@localhost ~]# ls -l /lib/modules | grep 2.6
drwxr-xr-x 6 root root 4096 Jun 17 18:47 2.6.18-92.1.17.el5
drwxr-xr-x 6 root root 4096 Jun 17 17:38 2.6.18-92.el5


[r...@localhost modules]# find . | grep lustre
./2.6.18-92.1.17.el5/kernel/net/lustre
./2.6.18-92.1.17.el5/kernel/net/lustre/libcfs.ko
./2.6.18-92.1.17.el5/kernel/net/lustre/lnet.ko
./2.6.18-92.1.17.el5/kernel/net/lustre/ksocklnd.ko
./2.6.18-92.1.17.el5/kernel/net/lustre/ko2iblnd.ko
./2.6.18-92.1.17.el5/kernel/net/lustre/lnet_selftest.ko
./2.6.18-92.1.17.el5/kernel/fs/lustre
./2.6.18-92.1.17.el5/kernel/fs/lustre/osc.ko
./2.6.18-92.1.17.el5/kernel/fs/lustre/ptlrpc.ko
./2.6.18-92.1.17.el5/kernel/fs/lustre/obdecho.ko
./2.6.18-92.1.17.el5/kernel/fs/lustre/lvfs.ko
./2.6.18-92.1.17.el5/kernel/fs/lustre/mgc.ko
./2.6.18-92.1.17.el5/kernel/fs/lustre/llite_lloop.ko
./2.6.18-92.1.17.el5/kernel/fs/lustre/lov.ko
./2.6.18-92.1.17.el5/kernel/fs/lustre/mdc.ko
./2.6.18-92.1.17.el5/kernel/fs/lustre/lquota.ko
./2.6.18-92.1.17.el5/kernel/fs/lustre/lustre.ko
./2.6.18-92.1.17.el5/kernel/fs/lustre/obdclass.ko
--- --- ---


I am still having the same problem, and I seriously wonder whether I am
missing something. I also tried a source install of the 'patchless
client', but the results were the same.

Are there any configuration steps needed after the RPM (or source)
installation? The only one I know of is restricting interfaces in
modprobe.conf; I have tried with and without it, with no success.
Could anyone please suggest debugging steps or tests? What output
would be most useful for me to provide? Any insights?

Also, I have a suggestion. It might be a good idea for the RPM
installation to check 'uname -r' against the kernel version the modules
were built for and, if they do not match, to suggest a source install.
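A rough sketch of such a check, written as shell functions. It assumes the kernel version is embedded in the RPM release field the way it appears in the package names above (for example `2.6.18_92.1.17.el5_lustre.1.8.0smp`). `pkg_kernel` and `check_kernel` are hypothetical helper names, not part of the Lustre packages.

```sh
#!/bin/sh
# Hypothetical helpers. The naming convention is inferred from the package
# names shown in this thread, not taken from the Lustre packaging scripts.

# Print the kernel version a Lustre modules package was built for, e.g.
#   lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
#   -> 2.6.18-92.1.17.el5
pkg_kernel() {
    release=${1##*-}              # text after the last '-': the RPM release
    kernel=${release%%_lustre*}   # drop the "_lustre.<version>smp" suffix
    printf '%s\n' "$kernel" | sed 's/_/-/'   # the first '_' stands for '-'
}

# Compare a running kernel (as printed by `uname -r`) with a package name.
check_kernel() {
    running=$1
    built_for=$(pkg_kernel "$2")
    if [ "$running" = "$built_for" ]; then
        echo "match: $running"
    else
        echo "mismatch: running $running, modules built for $built_for"
        return 1
    fi
}
```

It could be called as `check_kernel "$(uname -r)" "$(rpm -q lustre-modules)"`. On a server running the Lustre-patched kernel, `uname -r` itself carries the `_lustre...` suffix, so the comparison would need adjusting there.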

Thanks for the help. I really appreciate your patience.

-
Thanks,
CS.


On Wed, Jun 17, 2009 at 10:40 AM, Jerome, Ron wrote:
> I think the problem you have, as Cliff alluded to, is a mismatch between
> your kernel version and the kernel version of the Lustre modules.
>
>
>
> You have kernel “2.6.18-92.el5” and are installing Lustre
> “2.6.18_92.1.17.el5”. Note that the “.1.17” is significant, as the modules will
> end up in the wrong directory.  There is an update to CentOS that brings the
> kernel to the matching 2.6.18_92.1.17.el5 version; you can pull it off the
> CentOS mirror site in the updates directory.
>
>
>
>
>
> Ron.
>
>
>
> From: lustre-discuss-boun...@lists.lustre.org
> [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf Of Carlos Santana
> Sent: June 17, 2009 11:21 AM
> To: lustre-discuss@lists.lustre.org
> Subject: Re: [Lustre-discuss] Lustre installation and configuration problems
>
>
>
> And is there any specific installation order for patchless client? Could
> someone please share it with me?
>
> -
> CS.
>
> On Wed, Jun 17, 2009 at 10:18 AM, Carlos Santana  wrote:
>
> Huh... :( Sorry to bug you guys again...
>
> I am planning to make a fresh start now as nothing seems to have worked for
> me. If you have any comments/feedback please share them.
>
> I would like to confirm installation order before I make a fresh start. From
> Arden's experience:
> http://lists.lustre.org/pipermail/lustre-discuss/2009-June/010710.html , the
> lustre-modules package is installed last. As I was installing Lustre 1.8, I was
> referring to the 1.8 operations manual
> http://manual.lustre.org/index.php?title=Main_Page . The installation order
> in the manual is different than what Arden has suggested.
>
> Will it make a difference in configuration at later stage? Which one should
> I follow now?
> Any comments?
>
> Thanks,
> CS.
>
>
>
> On Wed, Jun 17, 2009 at 12:35 AM, Carlos Santana  wrote:
>
> Thanks Cliff.
>
> The depmod -a was successful before as well. I am using CentOS 5.2
> box. Following are the packages installed:
> [r...@localhost tmp]# rpm -qa | grep -i lustre
> lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
>
> lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
>
> [r...@localhost tmp]# uname -a
>
> Linux localhost.localdomain 2.6.18-92.el5 #1 SMP Tue Jun 10 18:49:47
> EDT 2008 i686 i686 i386 GNU/Linux
>
> And here is a output from strace for mount:
> http://www.heypasteit.com/clip/8WT
>
> Any further debugging hints?
>
> 

Re: [Lustre-discuss] Lustre installation and configuration problems

2009-06-17 Thread Carlos Santana
And is there a specific installation order for the patchless client? Could
someone please share it with me?

-
CS.

On Wed, Jun 17, 2009 at 10:18 AM, Carlos Santana  wrote:

> Huh... :( Sorry to bug you guys again...
>
> I am planning to make a fresh start now as nothing seems to have worked for
> me. If you have any comments/feedback please share them.
>
> I would like to confirm installation order before I make a fresh start.
> From Arden's experience:
> http://lists.lustre.org/pipermail/lustre-discuss/2009-June/010710.html ,
> the lustre-modules package is installed last. As I was installing Lustre 1.8, I was
> referring to the 1.8 operations manual
> http://manual.lustre.org/index.php?title=Main_Page . The installation
> order in the manual is different than what Arden has suggested.
>
> Will it make a difference in configuration at later stage? Which one should
> I follow now?
> Any comments?
>
> Thanks,
> CS.
>
>
> On Wed, Jun 17, 2009 at 12:35 AM, Carlos Santana  wrote:
>
>> Thanks Cliff.
>>
>> The depmod -a was successful before as well. I am using CentOS 5.2
>> box. Following are the packages installed:
>> [r...@localhost tmp]# rpm -qa | grep -i lustre
>> lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
>> lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
>>
>> [r...@localhost tmp]# uname -a
>> Linux localhost.localdomain 2.6.18-92.el5 #1 SMP Tue Jun 10 18:49:47
>> EDT 2008 i686 i686 i386 GNU/Linux
>>
>> And here is a output from strace for mount:
>> http://www.heypasteit.com/clip/8WT
>>
>> Any further debugging hints?
>>
>> Thanks,
>> CS.
>>
>> On 6/16/09, Cliff White  wrote:
>> > Carlos Santana wrote:
>> >> The '$ modprobe -l lustre*' did not show any module on a patchless
>> >> client. modprobe -v returns 'FATAL: Module lustre not found'.
>> >>
>> >> How do I install a patchless client?
>> >> I have tried lustre-client-modules and lustre-client-ver rpm packages
>> in
>> >> both sequences. Am I missing anything?
>> >>
>> >
>> > Make sure the lustre-client-modules package matches your running kernel.
>> > Run depmod -a to be sure
>> > cliffw
>> >
>> >> Thanks,
>> >> CS.
>> >>
>> >>
>> >>
>> >> On Tue, Jun 16, 2009 at 2:28 PM, Cliff White wrote:
>> >>
>> >> Carlos Santana wrote:
>> >>
>> >> The lctl ping and 'net up' failed with the following messages:
>> >> --- ---
>> >> [r...@localhost ~]# lctl ping 10.0.0.42
>> >> opening /dev/lnet failed: No such device
>> >> hint: the kernel modules may not be loaded
>> >> failed to ping 10.0.0...@tcp: No such device
>> >>
>> >> [r...@localhost ~]# lctl network up
>> >> opening /dev/lnet failed: No such device
>> >> hint: the kernel modules may not be loaded
>> >> LNET configure error 19: No such device
>> >>
>> >>
>> >> Make sure modules are unloaded, then try modprobe -v.
>> >> Looks like you have lnet mis-configured, if your module options are
>> >> wrong, you will see an error during the modprobe.
>> >> cliffw
>> >>
>> >> --- ---
>> >>
>> >>
>> >> I tried lustre_rmmod and depmod commands and it did not return
>> >> any error messages. Any further clues? Reinstall patchless
>> >> client again?
>> >>
>> >> -
>> >> CS.
>> >>
>> >>
>> >> On Tue, Jun 16, 2009 at 1:32 PM, Cliff White wrote:
>> >>
>> >>Carlos Santana wrote:
>> >>
>> >>I was able to run lustre_rmmod and depmod successfully.
>> The
>> >>'$lctl list_nids' returned the server ip address and
>> >> interface
>> >>(tcp0).
>> >>
>> >>I tried to mount the file system on a remote client, but
>> it
>> >>failed with the following message.
>> >>--- ---
>> 

Re: [Lustre-discuss] Lustre installation and configuration problems

2009-06-17 Thread Carlos Santana
Huh... :( Sorry to bug you guys again...

I am planning to make a fresh start now as nothing seems to have worked for
me. If you have any comments/feedback please share them.

I would like to confirm the installation order before I make a fresh start. In
Arden's experience:
http://lists.lustre.org/pipermail/lustre-discuss/2009-June/010710.html , the
lustre-modules package is installed last. As I am installing Lustre 1.8, I have
been referring to the 1.8 operations manual
http://manual.lustre.org/index.php?title=Main_Page . The installation order
in the manual is different from what Arden has suggested.

Will it make a difference in configuration at a later stage? Which one should
I follow now?
Any comments?

Thanks,
CS.


On Wed, Jun 17, 2009 at 12:35 AM, Carlos Santana  wrote:

> Thanks Cliff.
>
> The depmod -a was successful before as well. I am using CentOS 5.2
> box. Following are the packages installed:
> [r...@localhost tmp]# rpm -qa | grep -i lustre
> lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
>
> [r...@localhost tmp]# uname -a
> Linux localhost.localdomain 2.6.18-92.el5 #1 SMP Tue Jun 10 18:49:47
> EDT 2008 i686 i686 i386 GNU/Linux
>
> And here is a output from strace for mount:
> http://www.heypasteit.com/clip/8WT
>
> Any further debugging hints?
>
> Thanks,
> CS.
>
> On 6/16/09, Cliff White  wrote:
> > Carlos Santana wrote:
> >> The '$ modprobe -l lustre*' did not show any module on a patchless
> >> client. modprobe -v returns 'FATAL: Module lustre not found'.
> >>
> >> How do I install a patchless client?
> >> I have tried lustre-client-modules and lustre-client-ver rpm packages in
> >> both sequences. Am I missing anything?
> >>
> >
> > Make sure the lustre-client-modules package matches your running kernel.
> > Run depmod -a to be sure
> > cliffw
> >
> >> Thanks,
> >> CS.
> >>
> >>
> >>
> >> On Tue, Jun 16, 2009 at 2:28 PM, Cliff White wrote:
> >>
> >> Carlos Santana wrote:
> >>
> >> The lctl ping and 'net up' failed with the following messages:
> >> --- ---
> >> [r...@localhost ~]# lctl ping 10.0.0.42
> >> opening /dev/lnet failed: No such device
> >> hint: the kernel modules may not be loaded
> >> failed to ping 10.0.0...@tcp: No such device
> >>
> >> [r...@localhost ~]# lctl network up
> >> opening /dev/lnet failed: No such device
> >> hint: the kernel modules may not be loaded
> >> LNET configure error 19: No such device
> >>
> >>
> >> Make sure modules are unloaded, then try modprobe -v.
> >> Looks like you have lnet mis-configured, if your module options are
> >> wrong, you will see an error during the modprobe.
> >> cliffw
> >>
> >> --- ---
> >>
> >>
> >> I tried lustre_rmmod and depmod commands and it did not return
> >> any error messages. Any further clues? Reinstall patchless
> >> client again?
> >>
> >> -
> >> CS.
> >>
> >>
> >> On Tue, Jun 16, 2009 at 1:32 PM, Cliff White wrote:
> >>
> >>Carlos Santana wrote:
> >>
> >>I was able to run lustre_rmmod and depmod successfully.
> The
> >>'$lctl list_nids' returned the server ip address and
> >> interface
> >>(tcp0).
> >>
> >>I tried to mount the file system on a remote client, but
> it
> >>failed with the following message.
> >>--- ---
> >>[r...@localhost ~]# mount -t lustre 10.0.0...@tcp0
> :/lustre
> >>/mnt/lustre
> >>mount.lustre: mount 10.0.0...@tcp0:/lustre at
> /mnt/lustre
> >>failed: No such device
> >>Are the lustre modules loaded?
> >>Check /etc/modprobe.conf and /proc/filesystems
> >>Note 'alias lustre llite' should be removed from
> >> modprobe.conf
> >>--- ---
> >>
> >>However, t

Re: [Lustre-discuss] Lustre installation and configuration problems

2009-06-16 Thread Carlos Santana
Thanks Cliff.

The depmod -a was successful before as well. I am using a CentOS 5.2
box. The following packages are installed:
[r...@localhost tmp]# rpm -qa | grep -i lustre
lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp

[r...@localhost tmp]# uname -a
Linux localhost.localdomain 2.6.18-92.el5 #1 SMP Tue Jun 10 18:49:47
EDT 2008 i686 i686 i386 GNU/Linux

And here is the output of strace for mount: http://www.heypasteit.com/clip/8WT

Any further debugging hints?

Thanks,
CS.

On 6/16/09, Cliff White  wrote:
> Carlos Santana wrote:
>> The '$ modprobe -l lustre*' did not show any module on a patchless
>> client. modprobe -v returns 'FATAL: Module lustre not found'.
>>
>> How do I install a patchless client?
>> I have tried lustre-client-modules and lustre-client-ver rpm packages in
>> both sequences. Am I missing anything?
>>
>
> Make sure the lustre-client-modules package matches your running kernel.
> Run depmod -a to be sure
> cliffw
>
>> Thanks,
>> CS.
>>
>>
>>
>> On Tue, Jun 16, 2009 at 2:28 PM, Cliff White wrote:
>>
>> Carlos Santana wrote:
>>
>> The lctl ping and 'net up' failed with the following messages:
>> --- ---
>> [r...@localhost ~]# lctl ping 10.0.0.42
>> opening /dev/lnet failed: No such device
>> hint: the kernel modules may not be loaded
>> failed to ping 10.0.0...@tcp: No such device
>>
>> [r...@localhost ~]# lctl network up
>> opening /dev/lnet failed: No such device
>> hint: the kernel modules may not be loaded
>> LNET configure error 19: No such device
>>
>>
>> Make sure modules are unloaded, then try modprobe -v.
>> Looks like you have lnet mis-configured, if your module options are
>> wrong, you will see an error during the modprobe.
>> cliffw
>>
>> --- ---
>>
>>
>> I tried lustre_rmmod and depmod commands and it did not return
>> any error messages. Any further clues? Reinstall patchless
>> client again?
>>
>> -
>> CS.
>>
>>
>> On Tue, Jun 16, 2009 at 1:32 PM, Cliff White wrote:
>>
>>Carlos Santana wrote:
>>
>>I was able to run lustre_rmmod and depmod successfully. The
>>'$lctl list_nids' returned the server ip address and
>> interface
>>(tcp0).
>>
>>I tried to mount the file system on a remote client, but it
>>failed with the following message.
>>--- ---
>>[r...@localhost ~]# mount -t lustre 10.0.0...@tcp0:/lustre
>>/mnt/lustre
>>mount.lustre: mount 10.0.0...@tcp0:/lustre at /mnt/lustre
>>failed: No such device
>>Are the lustre modules loaded?
>>Check /etc/modprobe.conf and /proc/filesystems
>>Note 'alias lustre llite' should be removed from
>> modprobe.conf
>>--- ---
>>
>>However, the mounting is successful on a single node
>>configuration - with client on the same machine as MDS
>> and OST.
>>Any clues? Where to look for logs and debug messages?
>>
>>
>>Syslog || /var/log/messages is the normal place.
>>
>>You can use 'lctl ping' to verify that the client can reach
>> the server.
>>Usually in these cases, it's a network/name misconfiguration.
>>
>>Run 'tunefs.lustre --print' on your servers, and verify that
>> mgsnode=
>>is correct.
>>
>>cliffw
>>
>>
>>Thanks,
>>CS.
>>
>>
>>
>>
>>
>>On Tue, Jun 16, 2009 at 12:16 PM, Cliff White wrote:
>>
>>   Carlos Santana wrote:

Re: [Lustre-discuss] umount on client - device busy

2009-06-16 Thread Carlos Santana
fuser does not list any processes; it reports "Cannot stat /mnt/lustre: Cannot
send after transport endpoint shutdown".
-
Thanks,
CS.
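When fuser cannot stat the mount point, one possible fallback is to look at /proc directly. This is a minimal sketch, assuming a Linux /proc and assuming that reading the /proc/<pid>/cwd symlinks does not itself touch the Lustre mount (not verified on a client in this state). `pids_with_cwd_under` is a hypothetical helper, and it only catches processes whose working directory is under the mount point, not every open file.

```sh
#!/bin/sh
# Hypothetical helper: print the PIDs whose current working directory is
# the given directory or somewhere below it.
pids_with_cwd_under() {
    mnt=$1
    for link in /proc/[0-9]*/cwd; do
        # readlink fails for processes we may not inspect or that have exited
        target=$(readlink "$link" 2>/dev/null) || continue
        case $target in
            "$mnt"|"$mnt"/*)
                pid=${link#/proc/}      # strip the leading "/proc/"
                echo "${pid%/cwd}"      # strip the trailing "/cwd"
                ;;
        esac
    done
}
```

For the case in this thread it would be called as `pids_with_cwd_under /mnt/lustre`. A shell left sitting in the mounted directory is a common reason for "device is busy".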


On Tue, Jun 16, 2009 at 3:51 PM, Johann Lombardi  wrote:

> On Jun 16, 2009, at 9:17 PM, Carlos Santana wrote:
>
>> I am unable to unmount file system from the client. The (test)
>> installation is a single node type with MDS-OST-client running on same
>> machine. Following are the error messages with umount command (even with
>> -f):
>>
>> --- ---
>> umount: /mnt/lustre: device is busy
>> umount: /mnt/lustre: device is busy
>>
>> (device/resource busy with -f option)
>> --- ---
>>
>> What process might be accessing the device?
>>
>
> You can use lsof or fuser to get the list of processes using the
> filesystem.
>
> Cheers,
> Johann
>


Re: [Lustre-discuss] Lustre installation and configuration problems

2009-06-16 Thread Carlos Santana
'$ modprobe -l lustre*' did not show any modules on the patchless client, and
modprobe -v returns 'FATAL: Module lustre not found'.

How do I install a patchless client?
I have tried installing the lustre-client-modules and lustre-client-ver RPM
packages in both orders. Am I missing anything?
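A minimal sketch of checking which kernel directory lustre.ko actually landed in, assuming the usual /lib/modules/<kernel> layout. `kernels_with_lustre` is a hypothetical helper name.

```sh
#!/bin/sh
# Hypothetical helper: list the kernel versions under a modules root
# (default /lib/modules) that contain a lustre.ko module.
kernels_with_lustre() {
    root=${1:-/lib/modules}
    for dir in "$root"/*; do
        [ -d "$dir" ] || continue
        # A non-empty find result means this kernel tree has the module
        if [ -n "$(find "$dir" -name lustre.ko)" ]; then
            basename "$dir"
        fi
    done
}
```

If `kernels_with_lustre` prints a version other than the one reported by `uname -r`, modprobe will not find the module for the running kernel, which would be consistent with the 'Module lustre not found' error.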

Thanks,
CS.



On Tue, Jun 16, 2009 at 2:28 PM, Cliff White  wrote:

> Carlos Santana wrote:
>
>> The lctl ping and 'net up' failed with the following messages:
>> --- ---
>> [r...@localhost ~]# lctl ping 10.0.0.42
>> opening /dev/lnet failed: No such device
>> hint: the kernel modules may not be loaded
>> failed to ping 10.0.0...@tcp: No such device
>>
>> [r...@localhost ~]# lctl network up
>> opening /dev/lnet failed: No such device
>> hint: the kernel modules may not be loaded
>> LNET configure error 19: No such device
>>
>
> Make sure modules are unloaded, then try modprobe -v.
> Looks like you have lnet mis-configured, if your module options are wrong,
> you will see an error during the modprobe.
> cliffw
>
>  --- ---
>>
>> I tried lustre_rmmod and depmod commands and it did not return any error
>> messages. Any further clues? Reinstall patchless client again?
>>
>> -
>> CS.
>>
>>
>> On Tue, Jun 16, 2009 at 1:32 PM, Cliff White wrote:
>>
>>Carlos Santana wrote:
>>
>>I was able to run lustre_rmmod and depmod successfully. The
>>'$lctl list_nids' returned the server ip address and interface
>>(tcp0).
>>
>>I tried to mount the file system on a remote client, but it
>>failed with the following message.
>>--- ---
>>[r...@localhost ~]# mount -t lustre 10.0.0...@tcp0:/lustre
>>/mnt/lustre
>>mount.lustre: mount 10.0.0...@tcp0:/lustre at /mnt/lustre
>>failed: No such device
>>Are the lustre modules loaded?
>>Check /etc/modprobe.conf and /proc/filesystems
>>Note 'alias lustre llite' should be removed from modprobe.conf
>>--- ---
>>
>>However, the mounting is successful on a single node
>>configuration - with client on the same machine as MDS and OST.
>>Any clues? Where to look for logs and debug messages?
>>
>>
>>Syslog || /var/log/messages is the normal place.
>>
>>You can use 'lctl ping' to verify that the client can reach the server.
>>Usually in these cases, it's a network/name misconfiguration.
>>
>>Run 'tunefs.lustre --print' on your servers, and verify that mgsnode=
>>is correct.
>>
>>cliffw
>>
>>
>>Thanks,
>>CS.
>>
>>
>>
>>
>>
>>On Tue, Jun 16, 2009 at 12:16 PM, Cliff White wrote:
>>
>>   Carlos Santana wrote:
>>
>>   Thanks Kevin..
>>
>>   Please read:
>>
>> http://manual.lustre.org/manual/LustreManual16_HTML/ConfiguringLustre.html#50401328_pgfId-1289529
>>
>>   Those instructions are identical for 1.6 and 1.8.
>>
>>   For current lustre, only two commands are used for
>> configuration.
>>   mkfs.lustre and mount.
>>
>>
>>   Usually when lustre_rmmod returns that error, you run it a
>> second
>>   time, and it will clear things. Unless you have live mounts or
>>   network connections.
>>
>>   cliffw
>>
>>
>>   I am referring to 1.8 manual, but I was also referring to
>>HowTo
>>   page on wiki which seems to be for 1.6. The HowTo page
>>
>> http://wiki.lustre.org/index.php/Lustre_Howto#Using_Supplied_Configuration_Tools
>>   mentions abt lmc, lconf, and lctl.
>>
>>   The modules are installed in the right place. The '$
>>   lustre_rmmod' resulted in following o/p:
>>   [r...@localhost 2.6.18-92.1.17.el5_lustre.1.8.0smp]#
>>lustre_rmmod
>>   ERROR: Module obdfilter is in use
>>   ERROR: Module ost is in use
>>   ERROR: Module mds is in use
>>   ERROR: Module fsfilt_ldiskfs is in use
>>   ERROR: Module mgs is in use
>>   ERROR: Module mgc is in use by mgs
>>   ERROR: Module ldiskfs is in 

[Lustre-discuss] umount on client - device busy

2009-06-16 Thread Carlos Santana
I am unable to unmount the file system from the client. The (test) installation
is a single-node setup with the MDS, OST, and client running on the same machine.
The following are the error messages from the umount command (even with -f):

--- ---
umount: /mnt/lustre: device is busy
umount: /mnt/lustre: device is busy

(device/resource busy with -f option)
--- ---

What process might be accessing the device? 'ps -aux' showed me a lot of
Lustre-related processes, so I am not sure which one should be killed (if
necessary). Any hints?

Thanks,
CS.


Re: [Lustre-discuss] Lustre installation and configuration problems

2009-06-16 Thread Carlos Santana
The lctl ping and 'network up' commands failed with the following messages:
--- ---
[r...@localhost ~]# lctl ping 10.0.0.42
opening /dev/lnet failed: No such device
hint: the kernel modules may not be loaded
failed to ping 10.0.0...@tcp: No such device

[r...@localhost ~]# lctl network up
opening /dev/lnet failed: No such device
hint: the kernel modules may not be loaded
LNET configure error 19: No such device
--- ---

I tried the lustre_rmmod and depmod commands, and they did not return any error
messages. Any further clues? Should I reinstall the patchless client?
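Since lctl hints that the kernel modules may not be loaded, a quick look at /proc/modules can confirm it. This is a sketch only: `module_loaded` and `missing_modules` are hypothetical helpers, and the module list is simply taken from the .ko files mentioned in this thread.

```sh
#!/bin/sh
# Hypothetical helper: succeed if the named module appears in a
# /proc/modules-style listing (default /proc/modules).
module_loaded() {
    name=$1
    listing=${2:-/proc/modules}
    # Each line starts with the module name followed by a space
    grep -q "^$name " "$listing"
}

# Hypothetical helper: print which of the modules named in this thread
# are not currently loaded.
missing_modules() {
    listing=${1:-/proc/modules}
    for m in libcfs lnet ksocklnd lustre; do
        module_loaded "$m" "$listing" || echo "$m"
    done
}
```

If `missing_modules` prints `lnet`, then /dev/lnet cannot be opened and both `lctl ping` and `lctl network up` would be expected to fail as shown above.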

-
CS.


On Tue, Jun 16, 2009 at 1:32 PM, Cliff White  wrote:

> Carlos Santana wrote:
>
>> I was able to run lustre_rmmod and depmod successfully. The '$lctl
>> list_nids' returned the server ip address and interface (tcp0).
>>
>> I tried to mount the file system on a remote client, but it failed with
>> the following message.
>> --- ---
>> [r...@localhost ~]# mount -t lustre 10.0.0...@tcp0:/lustre /mnt/lustre
>> mount.lustre: mount 10.0.0...@tcp0:/lustre at /mnt/lustre failed: No such
>> device
>> Are the lustre modules loaded?
>> Check /etc/modprobe.conf and /proc/filesystems
>> Note 'alias lustre llite' should be removed from modprobe.conf
>> --- ---
>>
>> However, the mounting is successful on a single node configuration - with
>> client on the same machine as MDS and OST.
>> Any clues? Where to look for logs and debug messages?
>>
>
> Syslog || /var/log/messages is the normal place.
>
> You can use 'lctl ping' to verify that the client can reach the server.
> Usually in these cases, it's a network/name misconfiguration.
>
> Run 'tunefs.lustre --print' on your servers, and verify that mgsnode=
> is correct.
>
> cliffw
>
>
>> Thanks,
>> CS.
>>
>>
>>
>>
>>
>> On Tue, Jun 16, 2009 at 12:16 PM, Cliff White wrote:
>>
>>Carlos Santana wrote:
>>
>>Thanks Kevin..
>>
>>Please read:
>>
>> http://manual.lustre.org/manual/LustreManual16_HTML/ConfiguringLustre.html#50401328_pgfId-1289529
>>
>>Those instructions are identical for 1.6 and 1.8.
>>
>>For current lustre, only two commands are used for configuration.
>>mkfs.lustre and mount.
>>
>>
>>Usually when lustre_rmmod returns that error, you run it a second
>>time, and it will clear things. Unless you have live mounts or
>>network connections.
>>
>>cliffw
>>
>>
>>I am referring to 1.8 manual, but I was also referring to HowTo
>>page on wiki which seems to be for 1.6. The HowTo page
>>
>> http://wiki.lustre.org/index.php/Lustre_Howto#Using_Supplied_Configuration_Tools
>>mentions abt lmc, lconf, and lctl.
>>
>>The modules are installed in the right place. The '$
>>lustre_rmmod' resulted in following o/p:
>>[r...@localhost 2.6.18-92.1.17.el5_lustre.1.8.0smp]# lustre_rmmod
>>ERROR: Module obdfilter is in use
>>ERROR: Module ost is in use
>>ERROR: Module mds is in use
>>ERROR: Module fsfilt_ldiskfs is in use
>>ERROR: Module mgs is in use
>>ERROR: Module mgc is in use by mgs
>>ERROR: Module ldiskfs is in use by fsfilt_ldiskfs
>>ERROR: Module lov is in use
>>ERROR: Module lquota is in use by obdfilter,mds
>>ERROR: Module osc is in use
>>ERROR: Module ksocklnd is in use
>>ERROR: Module ptlrpc is in use by
>>obdfilter,ost,mds,mgs,mgc,lov,lquota,osc
>>ERROR: Module obdclass is in use by
>>obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc
>>ERROR: Module lnet is in use by ksocklnd,ptlrpc,obdclass
>>ERROR: Module lvfs is in use by
>>
>>  obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc,obdclass
>>ERROR: Module libcfs is in use by
>>
>>  
>> obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ksocklnd,ptlrpc,obdclass,lnet,lvfs
>>
>>Do I need to shutdown these services? How can I do that?
>>
>>Thanks,
>>CS.
>>
>>
>>On Tue, Jun 16, 2009 at 11:36 AM, Kevin Van Maren wrote:
>>
>>   I think lconf and lmc went away with Lustre 1.6.  Are you
>>sure you
>>   are looking at t

Re: [Lustre-discuss] Lustre installation and configuration problems

2009-06-16 Thread Carlos Santana
I was able to run lustre_rmmod and depmod successfully. '$lctl
list_nids' returned the server IP address and interface (tcp0).

I tried to mount the file system on a remote client, but it failed with the
following message:
--- ---
[r...@localhost ~]# mount -t lustre 10.0.0...@tcp0:/lustre /mnt/lustre
mount.lustre: mount 10.0.0...@tcp0:/lustre at /mnt/lustre failed: No such
device
Are the lustre modules loaded?
Check /etc/modprobe.conf and /proc/filesystems
Note 'alias lustre llite' should be removed from modprobe.conf
--- ---

However, mounting succeeds on a single-node configuration, with the
client on the same machine as the MDS and OST.
Any clues? Where should I look for logs and debug messages?
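The two places the mount error points at can be checked with a couple of small functions. This is a sketch under the assumption that the file formats are the standard ones; `fs_registered` and `has_llite_alias` are hypothetical names.

```sh
#!/bin/sh
# Hypothetical helper: succeed if a filesystem type is registered in a
# /proc/filesystems-style file (default /proc/filesystems).
fs_registered() {
    fstype=$1
    listing=${2:-/proc/filesystems}
    # Lines look like "nodev<TAB>lustre" or "<TAB>ext3"; the type is the last field
    awk -v fs="$fstype" '$NF == fs { found = 1 } END { exit !found }' "$listing"
}

# Hypothetical helper: succeed if the alias named in the mount error
# is still present in modprobe.conf (default /etc/modprobe.conf).
has_llite_alias() {
    conf=${1:-/etc/modprobe.conf}
    grep -Eq '^[[:space:]]*alias[[:space:]]+lustre[[:space:]]+llite' "$conf"
}
```

On the client, `fs_registered lustre || echo "lustre is not registered"` would show whether the lustre module ever loaded, and `has_llite_alias && echo "remove the alias"` covers the note in the error message.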

Thanks,
CS.




On Tue, Jun 16, 2009 at 12:16 PM, Cliff White  wrote:

> Carlos Santana wrote:
>
>> Thanks Kevin..
>>
>>  Please read:
>
> http://manual.lustre.org/manual/LustreManual16_HTML/ConfiguringLustre.html#50401328_pgfId-1289529
>
> Those instructions are identical for 1.6 and 1.8.
>
> For current lustre, only two commands are used for configuration.
> mkfs.lustre and mount.
>
>
> Usually when lustre_rmmod returns that error, you run it a second time, and
> it will clear things. Unless you have live mounts or network connections.
>
> cliffw
>
>
>  I am referring to 1.8 manual, but I was also referring to HowTo page on
>> wiki which seems to be for 1.6. The HowTo page
>> http://wiki.lustre.org/index.php/Lustre_Howto#Using_Supplied_Configuration_Tools
>> mentions lmc, lconf, and lctl.
>>
>> The modules are installed in the right place. The '$ lustre_rmmod'
>> resulted in following o/p:
>> [r...@localhost 2.6.18-92.1.17.el5_lustre.1.8.0smp]# lustre_rmmod
>> ERROR: Module obdfilter is in use
>> ERROR: Module ost is in use
>> ERROR: Module mds is in use
>> ERROR: Module fsfilt_ldiskfs is in use
>> ERROR: Module mgs is in use
>> ERROR: Module mgc is in use by mgs
>> ERROR: Module ldiskfs is in use by fsfilt_ldiskfs
>> ERROR: Module lov is in use
>> ERROR: Module lquota is in use by obdfilter,mds
>> ERROR: Module osc is in use
>> ERROR: Module ksocklnd is in use
>> ERROR: Module ptlrpc is in use by obdfilter,ost,mds,mgs,mgc,lov,lquota,osc
>> ERROR: Module obdclass is in use by
>> obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc
>> ERROR: Module lnet is in use by ksocklnd,ptlrpc,obdclass
>> ERROR: Module lvfs is in use by
>> obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc,obdclass
>> ERROR: Module libcfs is in use by
>> obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ksocklnd,ptlrpc,obdclass,lnet,lvfs
>>
>> Do I need to shutdown these services? How can I do that?
>>
>> Thanks,
>> CS.
>>
>>
>> On Tue, Jun 16, 2009 at 11:36 AM, Kevin Van Maren wrote:
>>
>>I think lconf and lmc went away with Lustre 1.6.  Are you sure you
>>are looking at the 1.8 manual, and not directions for 1.4?
>>
>>/usr/sbin/lctl should be in the lustre- RPM.  Do a:
>># rpm -q -l lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
>>
>>
>>Do make sure the modules are installed in the right place:
>># cd /lib/modules/`uname -r`
>># find . | grep lustre.ko
>>
>>If it shows up, then do:
>># lustre_rmmod
>># depmod
>>and try again.
>>
>>Otherwise, figure out where your modules are installed:
>># uname -r
>># cd /lib/modules
>># find . | grep lustre.ko
>>
>>
>>You can also double-check the NID.  On the MDS server, do
>># lctl list_nids
>>
>>Should show 10.0.0...@tcp0
>>
>>Kevin
>>
>>


Re: [Lustre-discuss] Lustre installation and configuration problems

2009-06-16 Thread Carlos Santana
Thanks Kevin..

I am referring to the 1.8 manual, but I was also referring to the HowTo page on
the wiki, which seems to be for 1.6. The HowTo page
http://wiki.lustre.org/index.php/Lustre_Howto#Using_Supplied_Configuration_Tools
mentions lmc, lconf, and lctl.

The modules are installed in the right place. '$ lustre_rmmod' produced
the following output:
[r...@localhost 2.6.18-92.1.17.el5_lustre.1.8.0smp]# lustre_rmmod
ERROR: Module obdfilter is in use
ERROR: Module ost is in use
ERROR: Module mds is in use
ERROR: Module fsfilt_ldiskfs is in use
ERROR: Module mgs is in use
ERROR: Module mgc is in use by mgs
ERROR: Module ldiskfs is in use by fsfilt_ldiskfs
ERROR: Module lov is in use
ERROR: Module lquota is in use by obdfilter,mds
ERROR: Module osc is in use
ERROR: Module ksocklnd is in use
ERROR: Module ptlrpc is in use by obdfilter,ost,mds,mgs,mgc,lov,lquota,osc
ERROR: Module obdclass is in use by
obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc
ERROR: Module lnet is in use by ksocklnd,ptlrpc,obdclass
ERROR: Module lvfs is in use by
obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc,obdclass
ERROR: Module libcfs is in use by
obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ksocklnd,ptlrpc,obdclass,lnet,lvfs

Do I need to shut down these services? How can I do that?
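Modules stay "in use" while Lustre targets or clients are mounted, so one way to see what is holding them is to list the live Lustre mounts. This is a sketch that assumes both server targets and client mounts show up in /proc/mounts with type `lustre`; `lustre_mounts` is a hypothetical helper.

```sh
#!/bin/sh
# Hypothetical helper: print the mount points of type "lustre" from a
# /proc/mounts-style file (default /proc/mounts).
lustre_mounts() {
    awk '$3 == "lustre" { print $2 }' "${1:-/proc/mounts}"
}
```

Each mount point it prints would have to be unmounted with `umount` before `lustre_rmmod` can unload the modules.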

Thanks,
CS.


On Tue, Jun 16, 2009 at 11:36 AM, Kevin Van Maren wrote:

> I think lconf and lmc went away with Lustre 1.6.  Are you sure you are
> looking at the 1.8 manual, and not directions for 1.4?
>
> /usr/sbin/lctl should be in the lustre- RPM.  Do a:
> # rpm -q -l lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
>
>
> Do make sure the modules are installed in the right place:
> # cd /lib/modules/`uname -r`
> # find . | grep lustre.ko
>
> If it shows up, then do:
> # lustre_rmmod
> # depmod
> and try again.
>
> Otherwise, figure out where your modules are installed:
> # uname -r
> # cd /lib/modules
> # find . | grep lustre.ko
>
>
> You can also double-check the NID.  On the MDS server, do
> # lctl list_nids
>
> Should show 10.0.0...@tcp0
>
> Kevin
>
>
> Carlos Santana wrote:
>
>> Thanks for the update Sheila. I am using manual for Lustre 1.8 (May-09).
>>
>> Arden, as per the 1.8 manual:
>> --- ---
>> Install the kernel, modules and ldiskfs packages.
>> Use the rpm -ivh command to install the kernel, module and ldiskfs
>> packages. For example:
>> $ rpm -ivh kernel-lustre-smp- \
>> kernel-ib- \
>> lustre-modules- \
>> lustre-ldiskfs-
>> c. Install the utilities/userspace packages.
>> Use the rpm -ivh command to install the utilities packages. For example:
>> $ rpm -ivh lustre-
>> d. Install the e2fsprogs package.
>> Use the rpm -i command to install the e2fsprogs package. For example:
>> $ rpm -i e2fsprogs-
>> If you want to add any optional packages to your Lustre file system,
>> install them
>> now.
>> 4. Verify that the boot loader (grub.conf or lilo.conf) has
>> --- ---
>> I followed the same order.
>>
>>
>> The lconf and lmc are not available on my system. I am not sure what are
>> they and when will I need it. I continued to explore other things in lustre
>> and have created MDS and OST mount points on the same system. I have
>> installed lustre client on a separate machine and when I tried to mount
>> lustre MGS on it, I received following error:
>>
>> --- ---
>> [r...@localhost ~]# mount -t lustre 10.0.0...@tcp0:/lustre /mnt/lustre
>> mount.lustre: mount 10.0.0...@tcp0:/lustre at /mnt/lustre failed: No such
>> device
>> Are the lustre modules loaded?
>> Check /etc/modprobe.conf and /proc/filesystems
>> Note 'alias lustre llite' should be removed from modprobe.conf
>> --- ---
>>
>>
>> The modprobe on client says, 'module lustre not found'. Any clues?
>>
>> Client: Linux localhost.localdomain 2.6.18-92.el5 #1 SMP Tue Jun 10
>> 18:49:47 EDT 2008 i686 i686 i386 GNU/Linux
>> MDS/OST: Linux localhost.localdomain 2.6.18-92.1.17.el5_lustre.1.8.0smp #1
>> SMP Wed Feb 18 18:40:54 MST 2009 i686 i686 i386 GNU/Linux
>>
>> Thanks,
>> CS.
>>
>>
>>
>> On Mon, Jun 15, 2009 at 5:16 PM, Arden Wiebe wrote:
>>
>>
>>Carlos:
>>
>>I'm not clear on which kernel package you tried to install.  There
>>is pretty much a set order to install the packages from my
>>understanding of the wording in the manual.  From experience:
>>
>>rpm -ivh kernel-lustre-smp-2.6.18-92.1.17.el5_lustre.1.8.0.x86_64.rpm
>>rpm -ivh lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp.x86_64.rpm
>>rpm -ivh
>>lustre-ldiskfs-3.0.8-2.6.18_92.1.17.el5_lustre.1.8.0smp.x86_64.rpm
>>rpm -ivh
>>lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp.x86_64.rpm
>>rpm -Uvh e2fsprogs-1.40.11.sun1-0redhat.rhel5.x86_64.rpm
>>
>>Hope that helps as that order has worked for me many times.
>>
>>Arden
>>
>>


Re: [Lustre-discuss] Lustre installation and configuration problems

2009-06-16 Thread Carlos Santana
Thanks for the update, Sheila. I am using the manual for Lustre 1.8 (May-09).

Arden, as per the 1.8 manual:
--- ---
b. Install the kernel, modules and ldiskfs packages.
Use the rpm -ivh command to install the kernel, module and ldiskfs
packages. For example:
$ rpm -ivh kernel-lustre-smp- \
kernel-ib- \
lustre-modules- \
lustre-ldiskfs-
c. Install the utilities/userspace packages.
Use the rpm -ivh command to install the utilities packages. For example:
$ rpm -ivh lustre-
d. Install the e2fsprogs package.
Use the rpm -i command to install the e2fsprogs package. For example:
$ rpm -i e2fsprogs-
If you want to add any optional packages to your Lustre file system, install
them now.
4. Verify that the boot loader (grub.conf or lilo.conf) has
--- ---
I followed the same order.
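
For reference, the order in the manual excerpt can be written down as a dry-run script. This is only a sketch: the function name print_install_plan and its directory argument are made up for illustration, the * globs stand in for the version strings the manual leaves out, and kernel-ib is omitted on the assumption that no InfiniBand hardware is involved. It prints the commands rather than running them.

```shell
#!/bin/sh
# Dry run: print the rpm commands in the order given by the manual excerpt.
# print_install_plan and its directory argument are hypothetical names.
print_install_plan() {
    dir=${1:-.}
    # Step b: kernel, modules and ldiskfs packages in one rpm call.
    echo "rpm -ivh $dir/kernel-lustre-smp-*.rpm $dir/lustre-modules-*.rpm $dir/lustre-ldiskfs-*.rpm"
    # Step c: utilities/userspace package.
    echo "rpm -ivh $dir/lustre-[0-9]*.rpm"
    # Step d: e2fsprogs.
    echo "rpm -i $dir/e2fsprogs-*.rpm"
}
```

Calling print_install_plan /path/to/rpms lists the three steps so they can be reviewed before anything is installed.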


The lconf and lmc tools are not available on my system. I am not sure what
they are or when I will need them. I continued to explore other things in
Lustre and have created MDS and OST mount points on the same system. I
installed the Lustre client on a separate machine, and when I tried to mount
the Lustre MGS on it, I received the following error:

--- ---
[r...@localhost ~]# mount -t lustre 10.0.0...@tcp0:/lustre /mnt/lustre
mount.lustre: mount 10.0.0...@tcp0:/lustre at /mnt/lustre failed: No such
device
Are the lustre modules loaded?
Check /etc/modprobe.conf and /proc/filesystems
Note 'alias lustre llite' should be removed from modprobe.conf
--- ---


Running modprobe on the client reports 'module lustre not found'. Any clues?

Client: Linux localhost.localdomain 2.6.18-92.el5 #1 SMP Tue Jun 10 18:49:47
EDT 2008 i686 i686 i386 GNU/Linux
MDS/OST: Linux localhost.localdomain 2.6.18-92.1.17.el5_lustre.1.8.0smp #1
SMP Wed Feb 18 18:40:54 MST 2009 i686 i686 i386 GNU/Linux
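
One thing that stands out in the two uname lines is that the client runs a stock kernel (2.6.18-92.el5) while the server runs the Lustre-patched one. Kernel modules are built for a specific kernel release, so one possible cause of the modprobe failure is that the installed lustre-modules package does not match the client's running kernel. A minimal sketch of that check follows. pkg_matches_kernel and check_client are hypothetical helper names, and the sketch assumes the package name embeds the kernel release with '-' written as '_', as the package names quoted in this thread do.

```shell
#!/bin/sh
# Hypothetical helper: succeed if an RPM name embeds the given kernel release.
# Package names in this thread carry the release with '-' written as '_'.
pkg_matches_kernel() {
    pkg=$1
    krel_rpm=$(printf '%s\n' "$2" | sed 's/-/_/g')
    case "$pkg" in
        *"$krel_rpm"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report whether any installed lustre-modules package fits the running kernel.
# Defined only; call check_client by hand on the client node.
check_client() {
    krel=$(uname -r)
    for pkg in $(rpm -qa | grep '^lustre-modules'); do
        if pkg_matches_kernel "$pkg" "$krel"; then
            echo "ok: $pkg matches $krel"
            return 0
        fi
    done
    echo "no lustre-modules package matches running kernel $krel"
    return 1
}
```

With the names from this thread, the lustre-modules package in Arden's message matches the server kernel 2.6.18-92.1.17.el5_lustre.1.8.0smp but not the client's 2.6.18-92.el5.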

Thanks,
CS.



On Mon, Jun 15, 2009 at 5:16 PM, Arden Wiebe wrote:

>
> Carlos:
>
> I'm not clear on which kernel package you tried to install.  There is
> pretty much a set order to install the packages from my understanding of the
> wording in the manual.  From experience:
>
> rpm -ivh kernel-lustre-smp-2.6.18-92.1.17.el5_lustre.1.8.0.x86_64.rpm
> rpm -ivh lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp.x86_64.rpm
> rpm -ivh lustre-ldiskfs-3.0.8-2.6.18_92.1.17.el5_lustre.1.8.0smp.x86_64.rpm
> rpm -ivh lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp.x86_64.rpm
> rpm -Uvh e2fsprogs-1.40.11.sun1-0redhat.rhel5.x86_64.rpm
>
> Hope that helps as that order has worked for me many times.
>
> Arden
>
>
> --- On Mon, 6/15/09, Carlos Santana wrote:
>
> > From: Carlos Santana
> > Subject: [Lustre-discuss] Lustre installation and configuration problems
> > To: lustre-discuss@lists.lustre.org
> > Date: Monday, June 15, 2009, 2:07 PM
> > Hello list,
> >
> > I am struggling to install Lustre 1.8 on a CentOS 5.2 box.
> > I am referring to Lustre manual
> > http://manual.lustre.org/index.php?title=Main_Page
> > and Lustre HowTo http://wiki.lustre.org/index.php/Lustre_Howto
> > guide. Following is the installation order and warning/error
> > messages (if any) associated with it.
> >
> >  - kernel-lustre patch
> >  - lustre-modules: http://www.heypasteit.com/clip/8UJ
> >
> >  - lustre-ldiskfs http://www.heypasteit.com/clip/8UK
> >
> >
> >  - lustre-utilities
> >  - e2fsprogs: http://www.heypasteit.com/clip/8UL
> >
> >
> > I did not see any test examples under
> > /usr/lib/lustre/examples directory as mentioned in the HowTo
> > document. In fact, I do not have 'examples' dir at
> > all. So I skipped to the
> > http://wiki.lustre.org/index.php/Lustre_Howto#Using_Supplied_Configuration_Tools
> > section. But I did not have lmc, lconf, and lctl commands
> > either. Any clues on how should I proceed with installation
> > and configuration? Is there any guide for step-by-step
> > installation? Feedback/comments welcome.
> >
> >
> > Thanks,
> > CS.
> >
> >


[Lustre-discuss] Lustre installation and configuration problems

2009-06-15 Thread Carlos Santana
Hello list,

I am struggling to install Lustre 1.8 on a CentOS 5.2 box. I am referring to
the Lustre manual http://manual.lustre.org/index.php?title=Main_Page and the
Lustre HowTo guide http://wiki.lustre.org/index.php/Lustre_Howto as I go.
Below is the installation order, with any associated warning/error messages:
 - kernel-lustre patch
 - lustre-modules: http://www.heypasteit.com/clip/8UJ
 - lustre-ldiskfs: http://www.heypasteit.com/clip/8UK
 - lustre-utilities
 - e2fsprogs: http://www.heypasteit.com/clip/8UL

I did not see any test examples under the /usr/lib/lustre/examples directory
mentioned in the HowTo document. In fact, I do not have an 'examples'
directory at all. So I skipped to the
http://wiki.lustre.org/index.php/Lustre_Howto#Using_Supplied_Configuration_Tools
section. But I do not have the lmc, lconf, and lctl commands either. Any clues
on how I should proceed with installation and configuration? Is there a
step-by-step installation guide? Feedback/comments welcome.
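
A quick way to see which of these tools are actually on the PATH is sketched below. missing_tools is a made-up helper name; it relies only on the POSIX `command -v` builtin and says nothing about which package should provide each tool.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that is not found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example call (run on the Lustre node):
#   missing_tools lmc lconf lctl mkfs.lustre mount.lustre
```

An empty result means every listed command was found; otherwise each missing name is printed on its own line.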

Thanks,
CS.


[Lustre-discuss] lustre and OS file system

2009-06-15 Thread Carlos Santana
I am a newbie to the Lustre world, and I have a very basic and probably stupid
question. When we install a Lustre FS, i.e., install the RPM packages for the
server side, what happens to the operating system's file system? When and how
does Lustre come into the picture? Comments appreciated.

-
Neil.