Hi Mike,

While installing OFED I used the command below:
# ./mlnxofedinstall -vvv --add-kernel-support --without-32bit --without-fw-update --hpc

I used the --add-kernel-support option, which adds kernel support (it runs mlnx_add_kernel_support.sh). Is this what you meant?

Thanks,
Aayush.
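
A rough way to check whether the installed driver modules were actually built for the running kernel is to compare the module's vermagic string with the output of uname -r. The sketch below assumes the HCA driver is mlx4_core; that module name and the helper function names are illustrative and do not come from this thread.

```shell
#!/bin/sh
# Sketch: compare a module's vermagic kernel release with the running kernel.
# The module name mlx4_core used in the example is an assumption; substitute
# the driver that is actually in use on the host.

# Print the kernel release, which is the first space-separated field of a
# vermagic string such as "2.6.32-358.el6.x86_64 SMP mod_unload modversions".
vermagic_release() {
    printf '%s\n' "${1%% *}"
}

# Return 0 if the module was built for the given kernel release,
# 1 if it was built for a different one, 2 if modinfo could not read it.
module_matches_kernel() {
    module=$1
    running=$2
    vermagic=$(modinfo -F vermagic "$module" 2>/dev/null) || return 2
    [ "$(vermagic_release "$vermagic")" = "$running" ]
}

# Example (only meaningful on the host where the drivers are installed):
#   module_matches_kernel mlx4_core "$(uname -r)" && echo "built for running kernel"
```

A mismatch here would point to the drivers needing a rebuild against the Lustre kernel, as suggested in the reply quoted below.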

On 9/30/2014 11:04 PM, Mike Ware wrote:
I knew I had it somewhere

http://lists.lustre.org/pipermail/lustre-discuss/2012-November/016988.html

Mike

On Tue, Sep 30, 2014 at 10:32 AM, Mike Ware <[email protected] <mailto:[email protected]>> wrote:

    I had a similar issue using the Mellanox packages. If I remember
    correctly, I had to recompile the drivers against the Lustre kernel
    for the install. I believe Mellanox had an article on this, but I
    don't have the link.

    Mike

    On Tue, Sep 30, 2014 at 8:07 AM, Parinay Kondekar
    <[email protected]
    <mailto:[email protected]>> wrote:

        IMO you should try strace to see if it shows anything.
        "Write failed: Broken pipe" is quite a common message, and it is
        difficult to conclude anything from it alone.

        Regards
        parinay

        On Tue, Sep 30, 2014 at 8:16 PM, aayush agrawal
        <[email protected]
        <mailto:[email protected]>> wrote:

            Hi Parinay,

            Yes, I see ib0 in the output of ifconfig -a.
            I also tried options lnet networks=o2ib0(ib0), but
            no luck.
            While loading lnet I do see errors in /var/log/messages:

            kernel: LNet: HW CPU cores: 32, npartitions: 4
            alg: No test for crc32 (crc32-table)
            kernel: alg: No test for adler32 (adler32-zlib)
            kernel: alg: No test for crc32 (crc32-pclmul)
            kernel: padlock: VIA PadLock Hash Engine not detected.
            modprobe: FATAL: Error inserting padlock_sha
            (/lib/modules/2.6.32_358/kernel/drivers/crypto/padlock-sha.ko):
            No such device

            But as per the link below, this should not be a problem?
            https://jira.hpdd.intel.com/browse/LU-1599

            modprobe lnet completes successfully, but I see "Write
            failed: Broken pipe" after running "lctl network up", and
            then the session gets logged out from the server.

            Thanks,
            Aayush.


            On 9/30/2014 7:21 PM, Parinay Kondekar wrote:
            - What is the output of 'ifconfig -a'? Do you see ib0 there?
              Specifying 'options lnet networks=o2ib0(ib0)' should be enough.
            - Anything in syslog?

            HTH

            On Tue, Sep 30, 2014 at 6:03 PM, aayush agrawal
            <[email protected]
            <mailto:[email protected]>> wrote:

                Hi,

                I am trying to build lustre 2.5.0 against
                MLNX_OFED_LINUX-2.2-1.0.1-rhel6.4-x86_64 on CentOS 6.4
                with kernel version 2.6.32-358,
                but I am not able to get the lnet configuration
                right. I used the settings suggested in the lustre 2.x
                manual, but I cannot bring the network up using lctl.

                Details:

                I have two server machines, one for mgs+mdt and a
                second for oss, plus one client machine. I want to
                set up Infiniband on all of these machines.
                I could run the steps below successfully on all
                three machines:
                1. Run script mlnxofedinstall
                # ./mlnxofedinstall -vvv --add-kernel-support
                --without-32bit --without-fw-update --hpc
                2. Restart openibd service
                # /etc/init.d/openibd restart
                3. configure ib0 interface.
                4. configure lustre with o2ib
                # ./configure
                --with-linux=Path_to_linux-2.6.32-358.18.1.el6
                --with-o2ib=/usr/src/ofa_kernel/default/

                5. make lustre rpms:
                    # make rpms
                This gave me a compilation error.
                I looked online for this error and found a bug
                registered for it:
                https://jira.hpdd.intel.com/browse/LU-4266
                
                The patch below, from the link above, solved the problem,
                so I could build the lustre rpms:
                http://review.whamcloud.com/#/c/8451/1
                

                Now I first want to do the Infiniband setup for mgs
                and mdt on a single machine, which also has an Ethernet IP.
                Then I want to format and mount mgs and mdt.
                So I installed the lustre rpms created above and then
                added the line below to /etc/modprobe.d/lustre.conf:
                options lnet networks=o2ib(ib0)

                Then I rebooted the machine to remove all lustre-related
                modules, including lnet, ran the modprobe lnet
                command to apply the above parameters, and then ran lctl
                network up, which gives me the error below:
                LNET configure error 100: Network is down

                I looked online and found the discussion below on the same error:
                http://lists.lustre.org/pipermail/lustre-discuss/2010-June/013510.html
                

                As per the suggestion in the above mail, I tried the line
                below in /etc/modprobe.d/lustre.conf, where IB_IP stands
                for the Infiniband IP address:
                options lnet networks=o2ib(ib0) routes="tcp0 IB_IP@o2ib"
                With this setting, "lctl network up" hangs for around
                2 to 3 minutes and then gives the error "Write failed:
                Broken pipe". The same happens with
                "options lnet networks=o2ib(ib0)".
                But if I set options lnet
                networks=tcp0(eth0),o2ib(ib0) routes="tcp1
                IB_IP@o2ib", then it gives "LNET configure error 100:
                Network is down".

                It seems that with networks=o2ib(ib0) I get the
                error "Write failed: Broken pipe".
                Am I missing anything in the steps above? Or
                how do I resolve this error?

                Thanks,
                Aayush.
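
The module-option and bring-up sequence described in the message above could be sketched roughly as follows. The function names are hypothetical helpers introduced for illustration; the commands they wrap (modprobe lnet, lctl network up, lctl list_nids) are standard, but this is only an outline of the steps and not a fix for the reported error.

```shell
#!/bin/sh
# Sketch of the lnet option line and bring-up steps described above.
# lnet_options_line and lnet_bring_up are illustrative names, not Lustre tools.

# Print the modprobe option line for one LNet network on one interface.
lnet_options_line() {
    net=$1      # LNet network name, e.g. o2ib
    iface=$2    # interface name, e.g. ib0
    printf 'options lnet networks=%s(%s)\n' "$net" "$iface"
}

# Load lnet, bring the network up, and list the local NIDs.
# Must be run as root on a host with the lustre modules installed.
lnet_bring_up() {
    modprobe lnet || return 1
    lctl network up || return 1
    lctl list_nids
}

# Example (writes the config file, so run it deliberately):
#   lnet_options_line o2ib ib0 > /etc/modprobe.d/lustre.conf
#   lnet_bring_up
```

If the network comes up, lctl list_nids should print a NID of the form address@o2ib for the ib0 interface.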

                _______________________________________________
                HPDD-discuss mailing list
                [email protected]
                <mailto:[email protected]>
                
                https://lists.01.org/mailman/listinfo/hpdd-discuss





        _______________________________________________
        Lustre-discuss mailing list
        [email protected]
        <mailto:[email protected]>
        http://lists.lustre.org/mailman/listinfo/lustre-discuss




