Re: [Lustre-discuss] LBUG encountered in 1.8.0

2009-08-05 Thread Isaac Huang
On Fri, Jul 31, 2009 at 10:52:46AM -0600, Daniel Kulinski wrote: >Unmounting lustre when our heartbeat software was misconfigured (IPMI >password changed). > > >tx1oss3-clusternet kernel: LustreError: >19350:0:(quota_context.c:1369:lqs_exit()) >ASSERTION(atomic_read(&q->lqs_re

[Lustre-discuss] Moving away from bugzilla

2009-08-05 Thread Mag Gam
Are there any plans to move away from Bugzilla for issue tracking? I have been lurking around https://bugzilla.lustre.org for several months now and I still find it very hard to use. Do others have the same feeling? Or is there a setting or a preferred filter to see all the new bugs in the 1.8 series?

Re: [Lustre-discuss] Problems upgrading from 1.6 to 1.8

2009-08-05 Thread Mag Gam
Were you able to fix this? On Fri, Jul 17, 2009 at 11:38 AM, Christopher J.Walker wrote: > In order to avoid occasional crashes on our 1.6.4.3 OSSs, we have just > upgraded our MDS and OSSs from 1.6.4.3 to lustre 1.8.0.1. Unfortunately, > we are having problems writing files - we've tried from

[Lustre-discuss] size of OST

2009-08-05 Thread Mag Gam
I know the largest possible OST is 8TB, but is that a recommended size? I want to avoid maintaining many objects, therefore I was thinking of creating 10x8TB OSTs on 10 OSS. Was wondering what kind of problems can arise. TIA

Re: [Lustre-discuss] Moving away from bugzilla

2009-08-05 Thread Brian J. Murrell
On Wed, 2009-08-05 at 06:22 -0400, Mag Gam wrote: > Are there any plans to move away from Bugzilla for issue tracking? I am not aware of any plans in this regard. I personally have not been introduced to a bug tracker that is as complete as bugzilla for what we need to do and I am quite familiar

Re: [Lustre-discuss] size of OST

2009-08-05 Thread Brian J. Murrell
On Wed, 2009-08-05 at 07:20 -0400, Mag Gam wrote: > I know the largest possible OST is 8TB, but is that a recommended > size? Sure, if it meets your needs. > I want to avoid maintaining many objects therefore I was thinking > of creating 10x8TB OSTs on 10 OSS. Was wondering what kind of problems >

Re: [Lustre-discuss] How to estimate the time for e2fsck on OST

2009-08-05 Thread Andreas Dilger
On Aug 04, 2009 22:47 +, Peter Grandi wrote: > Andreas Dilger wrote: > adilger> Putting 4 OSTs on a single disk doesn't make sense. > adilger> A single OST can be up to 8TB, and if you have multiple > adilger> OSTs on the same disk(s) it will cause terrible > adilger> performance problems due

Re: [Lustre-discuss] Lustre v1.8.0.1 slower than expected large-file, sequential-buffered-file-read speed

2009-08-05 Thread Rick Rothstein
Hi Andreas - Thanks for the advice. I will gather additional CPU stats and see what shows up. However, CPU does not seem to be a factor in the slower than expected large file buffered I/O reads. My machines have dual quad 2.66 GHz processors, and gross CPU usage hovers around 50% when I'm running

Re: [Lustre-discuss] Problems upgrading from 1.6 to 1.8

2009-08-05 Thread Christopher J.Walker
Mag Gam wrote: > Were you able to fix this? > The MGS errors, we did fix. What we did was follow the procedure for a minor upgrade in the Lustre manual - unmount all the clients, unmount all the OSTs, then tunefs.lustre --writeconf on the MGS/MDT, remount that, then tunefs.lustre --writeconf on

[Lustre-discuss] Inode errors at time of job failure

2009-08-05 Thread Daniel Kulinski
What would cause the following error to appear? LustreError: 10991:0:(file.c:2930:ll_inode_revalidate_fini()) failure -2 inode 14520180 This happened at the same time a job failed. Error number 2 is ENOENT which means that this inode does not exist? Is there a way to query the MDS to
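For readers decoding similar messages: the kernel reports failures as negative errno values, so "failure -2" corresponds to errno 2. A minimal sketch of that lookup, using only the Python standard library (the helper name `describe_kernel_rc` is made up for illustration and is not part of Lustre):

```python
import errno
import os


def describe_kernel_rc(rc: int) -> str:
    """Translate a kernel-style return code (negative errno) into a readable string."""
    code = abs(rc)  # kernel code returns -ENOENT, userspace errno tables are positive
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({code}): {os.strerror(code)}"


# The value taken from the ll_inode_revalidate_fini() message above.
print(describe_kernel_rc(-2))
```

This only names the error code. It says nothing about which inode or path the client was revalidating.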

Re: [Lustre-discuss] Moving away from bugzilla

2009-08-05 Thread Andreas Dilger
On Aug 05, 2009 09:51 -0400, Brian J. Murrell wrote: > On Wed, 2009-08-05 at 06:22 -0400, Mag Gam wrote: > > or is there a setting or a preferred filter to see all > > the new bugs in 1.8 series? > > Find and follow the release trackers. Release tracking bugs usually > have an alias of the form

Re: [Lustre-discuss] Moving away from bugzilla

2009-08-05 Thread Brian J. Murrell
On Wed, 2009-08-05 at 13:16 -0600, Andreas Dilger wrote: > > What brian meant was "181-tracking" or "182-tracking", etc. Oops. Yes, indeed. A little lysdexic this morning I guess. Sorry for the mis-information. b.

Re: [Lustre-discuss] size of OST

2009-08-05 Thread Andreas Dilger
On Aug 05, 2009 07:20 -0400, Mag Gam wrote: > I know the largest possible OST is 8TB, but is that a recommended > size? I want to avoid maintaining many objects therefore I was thinking > of creating 10x8TB OSTs on 10 OSS. Was wondering what kind of problems > can arise. There are many customers r

Re: [Lustre-discuss] Problems upgrading from 1.6 to 1.8

2009-08-05 Thread Andreas Dilger
On Aug 05, 2009 18:45 +0100, Christopher J.Walker wrote: > Aug 5 13:53:01 se02 kernel: LustreError: > 2668:0:(lib-move.c:95:lnet_try_match_md()) Matching packet from > 12345-10.1.4@tcp, match 1449 length 832 too big: 816 left, 816 allowed This looks like bug 20020, fixed in the 1.8.1 relea

Re: [Lustre-discuss] Moving away from bugzilla

2009-08-05 Thread Christopher J. Morrone
Mag Gam wrote: > Are there any plans to move away from Bugzilla for issue tracking? I > have been lurking around https://bugzilla.lustre.org for several > months now and I still find it very hard to use, do others have the > same feeling? or is there a setting or a preferred filter to see all > th

Re: [Lustre-discuss] Inode errors at time of job failure

2009-08-05 Thread Oleg Drokin
Hello! On Aug 5, 2009, at 3:12 PM, Daniel Kulinski wrote: > What would cause the following error to appear? Typically this is some sort of a race where you presume an inode exists (because you have some traces of it in memory), but it no longer does (on the MDS, anyway). So when client comes to fet

Re: [Lustre-discuss] Problems upgrading from 1.6 to 1.8

2009-08-05 Thread Mag Gam
Thanks for the response Chris. On Wed, Aug 5, 2009 at 5:20 PM, Andreas Dilger wrote: > On Aug 05, 2009 18:45 +0100, Christopher J.Walker wrote: >> Aug 5 13:53:01 se02 kernel: LustreError: >> 2668:0:(lib-move.c:95:lnet_try_match_md()) Matching packet from >> 12345-10.1.4@tcp, match 1449 len

Re: [Lustre-discuss] Lustre v1.8.0.1 slower than expected large-file, sequential-buffered-file-read speed

2009-08-05 Thread Andreas Dilger
On Aug 05, 2009 13:30 -0400, Rick Rothstein wrote: > My machines have dual quad 2.66 GHz processors, > and gross CPU usage hovers around 50% > when I'm running 16 "dd" read jobs. Be cautious of nice round numbers for CPU usage. Sometimes this means that 1 CPU is 100% busy, and another is 0% busy.
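To illustrate the point about round averages, here is a small sketch that compares two made-up 8-core samples with the same 50% aggregate. The figures and the `summarize_cpu` helper are assumptions for illustration only, not measurements from the machines in this thread. On a real node the per-core numbers would come from something like /proc/stat or `mpstat -P ALL`.

```python
from statistics import mean


def summarize_cpu(per_cpu_busy, saturated_at=95.0):
    """Return (average busy %, indices of cores at or above the saturation threshold)."""
    avg = mean(per_cpu_busy)
    saturated = [i for i, busy in enumerate(per_cpu_busy) if busy >= saturated_at]
    return avg, saturated


# Hypothetical samples: both average to the same aggregate figure.
balanced = [50.0] * 8               # every core half busy
skewed = [100.0] * 4 + [0.0] * 4    # four cores pegged, four idle

for label, sample in (("balanced", balanced), ("skewed", skewed)):
    avg, hot = summarize_cpu(sample)
    print(f"{label}: average {avg:.0f}% busy, saturated cores: {hot}")
```

Both samples report a 50% average, but only the skewed one has cores that are out of headroom, which is the case worth ruling out before deciding that CPU is not a factor.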