[Lustre-discuss] question about dcache revalidate
Hi,

I was reading dcache.c, and the following comment in ll_revalidate_it() seems confusing. Does it mean llite can hash a positive dentry into the dcache without taking the inode LOOKUP lock?

        /*
         * This part is here to combat evil-evil race in real_lookup on 2.6
         * kernels. The race details are: We enter do_lookup() looking for some
         * name, there is nothing in dcache for this name yet and d_lookup()
         * returns NULL. We proceed to real_lookup(), and while we do this,
         * another process does open on the same file we looking up (most simple
         * reproducer), open succeeds and the dentry is added. Now back to
         * us. In real_lookup() we do d_lookup() again and suddenly find the
         * dentry, so we call d_revalidate on it, but there is no lock, so
         * without this code we would return 0, but unpatched real_lookup just
         * returns -ENOENT in such a case instead of retrying the lookup. Once
         * this is dealt with in real_lookup(), all of this ugly mess can go and
         * we can just check locks in ->d_revalidate without doing any RPCs
         * ever.
         */

Best Regards,
Tao
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
Re: [Lustre-discuss] Need Help
Hi,

Additional logging from the MDS and OSSs is required to really tell what's going on. That said, you can first verify that your OSS nodes can successfully contact your MDS and MGS nodes; lctl ping will indicate this. If you find they are contacting each other successfully, you can then try to abort recovery on both the MDT and the OSTs you're attempting to mount (the -o abort_recov mount option).

-cf

On 01/09/2012 04:00 AM, Patrice Hamelin wrote:
> Hi,
>
> I get that occasionally and try to remount a second time, which works. I am interested in finding out what's happening too.
>
> Thanks.
>
> On 01/07/12 07:19, Ashok nulguda wrote:
>> Dear All,
>>
>> We have Lustre 1.8.4 installed with 2 MDS servers and 2 OSS servers, with 17 OSTs and 1 MDT, and HA configured on both the MDS and OSS servers.
>>
>> Problem: Some of my OSTs are not mounting on my OSS servers. When I try to mount them manually, the mount throws an error:
>>
>>     failed: Transport endpoint is not connected
>>
>> Command:
>>
>>     mount -t lustre /dev/mapper/.. /OST1
>>     failed: Transport endpoint is not connected
>>
>> However, when we log in to the MDS server and check the Lustre OST status with
>>
>>     cat /proc/fs/lustre/mds/lustre-MDT/recovery_status
>>
>> it shows completed. Also, in
>>
>>     cat /proc/fs/lustre/devices
>>
>> all my MDTs and OSTs show UP status.
>>
>> Can anyone help us debug this?
>> Thanks and Regards,
>> Ashok
>>
>> --
>> Ashok Nulguda
>> TATA ELXSI LTD
>> Mb: +91 9689945767
>> Mb: +91 9637095767
>> Land line: 2702044871
>> Email: ash...@tataelxsi.co.in mailto:tshrik...@tataelxsi.co.in
>
> --
> Patrice Hamelin
> Senior OS specialist
> Environment Canada
> 2121 Transcanada Highway
> Dorval, QC H9P 1J3
> Telephone 514-421-5303
> Facsimile 514-421-7231
> Government of Canada
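
The two steps suggested in the reply at the top of this thread (check connectivity with lctl ping, then remount with -o abort_recov) might be sketched roughly as below. This is only an illustration: the helper names run and try_mount_ost, the DRY_RUN switch, and the NID, device and mount point in the usage line are placeholders, not values from the thread. By default the helper only prints the commands instead of executing them.

```shell
#!/bin/sh
# Print the command when DRY_RUN is 1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# try_mount_ost SERVER_NID DEVICE MOUNTPOINT
# 1. Check that this OSS can reach the given MDS/MGS NID over LNet.
# 2. Only if the ping succeeds, mount the OST with recovery aborted.
try_mount_ost() {
    server_nid="$1"
    device="$2"
    mountpoint="$3"

    run lctl ping "$server_nid" || return 1
    run mount -t lustre -o abort_recov "$device" "$mountpoint"
}
```

For example, `try_mount_ost 10.0.0.1@tcp /dev/mapper/ost1 /mnt/ost1` prints the two commands; setting DRY_RUN=0 would actually run them. As far as I understand, aborting recovery gives up on replaying requests from clients that have not reconnected, so it is better treated as a last resort once connectivity has been confirmed.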
Re: [Lustre-discuss] Finding bugs in Lustre with Coccinelle
On 2012-01-09, at 4:33 AM, Karsten Weiss wrote:
> I've compiled(^1) Lustre 2.1.0 on CentOS 6.2 with Clang's static analyzer (LLVM 3.0). Here's the bug summary to give you an idea of the result:
>
> Bug Type                                                Quantity
> All Bugs                                                     594
> Dead code
>   Idempotent operation                                        11
> Dead store
>   Dead assignment                                             76
>   Dead increment                                               3
>   Dead initialization                                          3
> Logic error
>   Assigned value is garbage or undefined                       9
>   Called function pointer is null (null dereference)          27
>   Dereference of null pointer                                456
>   Dereference of undefined pointer value                       1
>   Division by zero                                             3
>   Function call argument is an uninitialized value             4
>   Garbage return value                                         1
>
> You can download the full result (annotated source code) here:
> http://dl.dropbox.com/u/1868416/lustre-2.1.0-scan-build.tar.bz2
> (I will delete this file in a couple of days)

Karsten, can you please split this tarball into a couple of smaller parts and attach it to http://bugs.whamcloud.com/browse/LU-871, the bug that I previously opened to track defects found by Clang/LLVM. The attachment size limit is 10MB, but splitting into logical parts by the defect type would be ideal.

> To view the result, extract the archive and point your web browser at:
> lustre-2.1.0-scan-build/2012-01-09-1/index.html
>
> Cheers,
> Karsten
>
> ^1: Here's what I used:
>
>     touch META
>     sh autogen.sh
>     scan-build ./configure --disable-server \
>         --with-linux=/usr/src/kernels/2.6.32-220.2.1.el6.x86_64 \
>         --with-linux-obj=/lib/modules/2.6.32-220.2.1.el6.x86_64/build/ \
>         --with-downstream-release=wc1
>     mkdir ./lustre-2.1.0-scan-build
>     scan-build -o ./lustre-2.1.0-scan-build/ make -j 24
>
> --
> Dipl.-Inf. Karsten Weiss
> science + computing ag (creating IT solutions)
> Hagellocher Weg 73, 72070 Tuebingen
> phone: +49 7071 9457 452
> teamline: +49 7071 9457 681
> email: k.we...@science-computing.de
> www.science-computing.de
> --
> Board of Management: Dr. Bernd Finkbeiner, Dr. Roland Niemeier, Dr. Arno Steitz, Dr. Ingrid Zech
> Chairman of the Supervisory Board: Philippe Miltin
> Registered Office: Tuebingen
> Registration Court: Stuttgart
> Commercial Register No.: HRB 382196

Cheers, Andreas
--
Andreas Dilger
Whamcloud, Inc.
Principal Engineer
http://www.whamcloud.com/
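
If splitting the results by defect type is not practical, a plain size-based split is one fallback for staying under the 10MB attachment limit mentioned above. The sketch below is only an illustration: the helper name split_for_upload, the 9 MiB default chunk size and the ".part-" suffix are assumptions, not anything used in the thread.

```shell
#!/bin/sh
# split_for_upload ARCHIVE [CHUNK_BYTES]
# Cut ARCHIVE into pieces of at most CHUNK_BYTES bytes (default 9 MiB),
# written next to it as ARCHIVE.part-aa, ARCHIVE.part-ab, and so on.
split_for_upload() {
    archive="$1"
    chunk_bytes="${2:-9437184}"   # 9 * 1024 * 1024, safely below 10MB

    split -b "$chunk_bytes" "$archive" "$archive.part-"
}
```

The pieces can be joined again with `cat lustre-2.1.0-scan-build.tar.bz2.part-* > lustre-2.1.0-scan-build.tar.bz2`, since the shell expands the suffixes in the same order split wrote them. A size-based split is not browsable piece by piece, though, which is why per-defect-type archives would still be the nicer option.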