[Lustre-discuss] question about dcache revalidate

2012-01-11 Thread tao.peng
Hi,

I was reading dcache.c, and the following comment in ll_revalidate_it() 
seems confusing. Does it mean llite can hash a positive dentry into the 
dcache without taking the inode LOOKUP lock?

/*
 * This part is here to combat evil-evil race in real_lookup on 2.6
 * kernels.  The race details are: We enter do_lookup() looking for some
 * name, there is nothing in dcache for this name yet and d_lookup()
 * returns NULL.  We proceed to real_lookup(), and while we do this,
 * another process does open on the same file we looking up (most simple
 * reproducer), open succeeds and the dentry is added. Now back to
 * us. In real_lookup() we do d_lookup() again and suddenly find the
 * dentry, so we call d_revalidate on it, but there is no lock, so
 * without this code we would return 0, but unpatched real_lookup just
 * returns -ENOENT in such a case instead of retrying the lookup. Once
 * this is dealt with in real_lookup(), all of this ugly mess can go and
 * we can just check locks in ->d_revalidate without doing any RPCs
 * ever.
 */

Best Regards,
Tao


___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Need Help

2012-01-11 Thread Colin Faber
Hi,

Additional logging from the MDS and the OSSes is needed to really tell 
what's going on. That said, first verify that your OSS nodes can 
successfully contact your MDS and MGS nodes; lctl ping will show 
this. If they can reach each other, you can then try aborting recovery 
on both the MDT and the OSTs you're attempting to mount (the 
-o abort_recov mount option).
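
A minimal sketch of those two steps, assuming a hypothetical MDS NID, OST 
device, and mount point (none of these values come from the thread; 
substitute your own). By default it only prints the commands; set 
DRY_RUN=0 on a real OSS node to execute them.

```shell
#!/bin/sh
# Hypothetical placeholders, not taken from the thread: replace them.
MDS_NID="10.0.0.1@tcp"               # assumed NID of the MDS/MGS node
OST_DEV="/dev/mapper/EXAMPLE_OST"    # assumed OST block device
OST_MNT="/mnt/EXAMPLE_OST"           # assumed mount point

# run CMD...: print the command when DRY_RUN is 1 (the default),
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Step 1: check LNET connectivity from this OSS to the MDS/MGS.
run lctl ping "$MDS_NID"

# Step 2: mount the OST and skip recovery.
run mount -t lustre -o abort_recov "$OST_DEV" "$OST_MNT"
```

If the ping fails, fixing the network or LNET configuration comes before 
any attempt to abort recovery.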

-cf


On 01/09/2012 04:00 AM, Patrice Hamelin wrote:
 Hi,

   I get that occasionally too; trying the mount a second time works. 
 I am interested in finding out what's happening as well.

 Thanks.

 On 01/07/12 07:19, Ashok nulguda wrote:
 Dear All,

 We have Lustre 1.8.4 installed with 2 MDS servers and 2 OSS servers, 
 17 OSTs, and 1 MDT, with HA configured for both the MDS and OSS nodes.
 Problem:
 Some of my OSTs do not mount on my OSS servers.
 When I try to mount one manually, it fails with the error "Transport 
 endpoint is not connected":
 Command: mount -t lustre /dev/mapper/..   /OST1
  failed: Transport endpoint is not connected

 However, when we log in to the MDS server and check the Lustre OST 
 status, we find that
 cat /proc/fs/lustre/mds/lustre-MDT/recovery_status
 shows recovery as completed, and
 cat /proc/fs/lustre/devices
 shows all of my MDT and OST devices in UP status.

 Can anyone help us debug this?


 Thanks and Regards
 Ashok

 -- 
 *Ashok Nulguda
 *
 *TATA ELXSI LTD*
 *Mb : +91 9689945767
 Mb : +91 9637095767
 Land line : 2702044871
 *
 *Email :ash...@tataelxsi.co.in mailto:tshrik...@tataelxsi.co.in*



 -- 
 Patrice Hamelin
 Specialiste sénior en systèmes d'exploitation | Senior OS specialist
 Environnement Canada | Environment Canada
 2121, route Transcanadienne | 2121 Transcanada Highway
 Dorval, QC H9P 1J3
 Téléphone | Telephone 514-421-5303
 Télécopieur | Facsimile 514-421-7231
 Gouvernement du Canada | Government of Canada




Re: [Lustre-discuss] Finding bugs in Lustre with Coccinelle

2012-01-11 Thread Andreas Dilger
On 2012-01-09, at 4:33 AM, Karsten Weiss wrote:
 I've compiled(^1) Lustre 2.1.0 on CentOS 6.2 with Clang's static analyzer 
 (LLVM 3.0). Here's the bug summary to give you an idea of the result:
 
 Bug Type                                            Quantity
 
 All Bugs                                                 594
 
 Dead code
   Idempotent operation                                    11
 
 Dead store
   Dead assignment                                         76
   Dead increment                                           3
   Dead initialization                                      3
 
 Logic error
   Assigned value is garbage or undefined                   9
   Called function pointer is null (null dereference)      27
   Dereference of null pointer                            456
   Dereference of undefined pointer value                   1
   Division by zero                                         3
   Function call argument is an uninitialized value         4
   Garbage return value                                     1
 
 You can download the full result (annotated source code) here:
 
 http://dl.dropbox.com/u/1868416/lustre-2.1.0-scan-build.tar.bz2
 (I will delete this file in a couple of days)

Karsten,
Can you please split this tarball into a couple of smaller parts and
attach them to http://bugs.whamcloud.com/browse/LU-871, the bug that I
previously opened to track defects found by Clang/LLVM?

The attachment size limit is 10MB, but splitting into logical parts by
the defect type would be ideal.
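
A rough sketch of a purely size-based split, assuming GNU split is 
available. The helper name split_for_upload and the 9M piece size are 
illustrative choices, not something from this thread. Splitting by defect 
type would still have to be done by hand on the extracted report.

```shell
# split_for_upload FILE SIZE
# Cut FILE into pieces of at most SIZE bytes, named FILE.part-aa,
# FILE.part-ab, and so on. SIZE is passed straight to split -b.
split_for_upload() {
    split -b "$2" "$1" "$1.part-"
}

# Usage (hypothetical, stays under the 10MB attachment limit):
#   split_for_upload lustre-2.1.0-scan-build.tar.bz2 9M
# Reassemble on the other side with:
#   cat lustre-2.1.0-scan-build.tar.bz2.part-* > lustre-2.1.0-scan-build.tar.bz2
```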

 To view the result extract the archive and point your web browser at:
 
 lustre-2.1.0-scan-build/2012-01-09-1/index.html
 
 Cheers,
 Karsten
 
 ^1: Here's what I used:
 
 touch META
 sh autogen.sh
 scan-build ./configure --disable-server \
     --with-linux=/usr/src/kernels/2.6.32-220.2.1.el6.x86_64 \
     --with-linux-obj=/lib/modules/2.6.32-220.2.1.el6.x86_64/build/ \
     --with-downstream-release=wc1
 mkdir ./lustre-2.1.0-scan-build
 scan-build -o ./lustre-2.1.0-scan-build/ make -j 24
 
 -- 
 ___creating IT solutions
 Dipl.-Inf. Karsten Weiss            science + computing ag
 phone:    +49 7071 9457 452         Hagellocher Weg 73
 teamline: +49 7071 9457 681         72070 Tuebingen
 email:    k.we...@science-computing.de  www.science-computing.de
 -- 
 Vorstand/Board of Management:
 Dr. Bernd Finkbeiner, Dr. Roland Niemeier, 
 Dr. Arno Steitz, Dr. Ingrid Zech
 Vorsitzender des Aufsichtsrats/
 Chairman of the Supervisory Board:
 Philippe Miltin
 Sitz/Registered Office: Tuebingen
 Registergericht/Registration Court: Stuttgart
 Registernummer/Commercial Register No.: HRB 382196 
 
 


Cheers, Andreas
--
Andreas Dilger   Whamcloud, Inc.
Principal Engineer   http://www.whamcloud.com/



