Re: [openib-general] ibv_reg_mr failure with pvfs on ehca?
Hi Kyle! > And, setting the debug_level flag definitely caused the server to not > respond... I rebooted and tried it again, same thing, setting the > debug_level flag causes the server to crash. (I can still login, but > cannot execute anything, e.g. 'ls', it seems all the cpu's are spinning) > p5l5:~# modprobe hcad_mod nr_ports=1 debug_level= > console output after above command hangs server: > PU0003 000e0252:hipz_h_register_rpage >>> > adapter_handle=10020304 pagesize=0 queue_type=0 > resource_handle=700100018600 logical_address_of_page=e6741000 count=200 > PU0003 000e0078:ehca_hcall_7arg_7ret >>> opcode=1ac > arg1=10020304 arg2=0 arg3=700100018600 arg4=e6741000 > arg5=200 arg6=0 arg7=0 > PU0003 000e0096:ehca_hcall_7arg_7ret <<< opcode=1ac ret=f out1=50 > out2=50 out3=50 out4=50 out5=50 out6=50 out7=50 > PU0003 000e0263:hipz_h_register_rpage <<< ret=f > PU0003 000e04ad:hipz_h_register_rpage_mr <<< ret=f > PU0003 0009076c:ehca_set_pagebuf >>> pginfo=c000eb7b75e0 type=1 > num_pages=1d4000 num_4k=1d4000 next_buf=0 next_4k=30600 number=200 > kpage=c000e6741000 page_cnt=30600 page_4k_cnt=30600 next_listelem=0 > region= next_chunk= next_nmap=0 > PU0003 00090807:ehca_set_pagebuf <<< ret=0 e_mr=c000e1ac2e80 > pginfo=c000eb7b75e0 type=1 num_pages=1d4000 num_4k=1d4000 next_buf=0 > next_4k=30800 number=200 kpage=c000e6742000 page_cnt=30800 > page_4k_cnt=30800 i=200 next_listelem=0 region= > next_chunk= next_nmap=0 > PU0003 000e049e:hipz_h_register_rpage_mr >>> > adapter_handle=10020304 mr=c000e1ac2e80 > mr_handle=700100018600 pagesize=0 queue_type=0 > logical_address_of_page=e6741000 count=200 > PU0003 000e0252:hipz_h_register_rpage >>> > adapter_handle=10020304 pagesize=0 queue_type=0 > resource_handle=700100018600 logical_address_of_page=e6741000 count=200 > PU0003 000e0078:ehca_hcall_7arg_7ret >>> opcode=1ac > arg1=10020304 arg2=0 arg3=700100018600 arg4=e6741000 > arg5=200 arg6=0 arg7=0 > PU0003 000e0096:ehca_hcall_7arg_7ret <<< opcode=1ac ret=f out1=50 > out2=50 out3=50 out4=50 out5=50 out6=50 out7=50 > PU0003 000e0263:hipz_h_register_rpage <<< ret=f > We looked at the traces above and saw a register MR with 0x1d4000 pages, that's about 7,3GB. In this trace part we are at registering the pages 0x30600-0x307FF. So we really guess the system seems to be busy with flushing out the remaining traces and appears to hang while you can do login or ping to it. Fortunately you have an "old" version of ehca that allows selecting debug traces for certain components. In this case I would filter only debug traces for mrmw, and the command for that looks like this: echo 6966 > /sys/bus/ibmebus/drivers/ehca/debug_level ^this should turn on debug traces for mrmw only Or you pass the option debug_level to modprobe: modprobe hcad_mod debug_level=6966 then you should see only mrmw traces in dmesg and that's still a lot, because we do register the whole mem space at module load time. If that still seems to hang, I can provide you with a debug patch later. For now please give us little time to set up test envs and recreate your problem. Thanks! Nam ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] ibv_reg_mr failure with pvfs on ehca?
On Oct 23, 2006, at 8:42 AM, Hoang-Nam Nguyen wrote: > Hello Troy! >> The netpipe code is available with mercurial by: >> hg clone http://source.scl.ameslab.gov/hg/netpipe3-pvfs-dev >> Once you have pvfs2-1.5.1 installed, you should be able to do 'make >> pvfs' in the netpipe3-pvfs-dev directory and build NPpvfs. >> The command line arguments I used to reproduce this were: >> ./NPpvfs -d $PVFS_FILE_PATH -l 32768 -u 268435456 -n 100 -o >> $NETPIPE_OUTPUT_FILE > Did you compile pvfs and NPpvfs as 32-bit or 64-bit libs/execs? > I did compile pvfs and NPpvfs as is and realized that pvfs is built > by default as 32-bit and NPpvfs as 64-bit. Hence NPpvfs complained > to find incompatible pvfs libs. > Regards > Nam > I wasn't able to get reliable backtraces out of a 64 bit NPpvfs and pvfs libs, so I rebuilt as 32 bit, and now I get much more interesting errors and kernel logs.. If I start 4 netpipe processes on the same node with: ./NPpvfs -l 32768 -u 268435456 -n 100 -o results/proc2.w.out -I -d / pvfs2/6node/proc2 I get errors like: 27: 786429 bytes100 times --> 2249.96 Mbps in2666.70 usec 28: 786432 bytes100 times --> [E 18:47:20.394586] Error: ib_check_cq: entry id 0x100ac7f0 opcode RDMA WRITE error IBV_WC_LOC_PROT_ERR. [E 18:47:20.395051] [bt] ./NPpvfs(error+0x9c) [0x1005858c] [E 18:47:20.395087] [bt] ./NPpvfs [0x10056a00] [E 18:47:20.395118] [bt] ./NPpvfs [0x1005726c] And kernel logs like this: Oct 23 18:48:37 p5l8 kernel: PU0007 00060066:print_error_data HCAD_ERROR QP 0xdfe (resource=2dfe) has errors. Oct 23 18:48:37 p5l8 kernel: PU0007 00060077:print_error_data HCAD_ERROR Error data is available: 2dfe. Oct 23 18:48:37 p5l8 kernel: PU0007 00060079:print_error_data HCAD_ERROR EHCA - error data begin --- Oct 23 18:48:37 p5l8 kernel: PU0007 0006007a:print_error_data resource=2dfe adr=c0012ec3f000 ofs= 04d0 2dfe Oct 23 18:48:37 p5l8 kernel: PU0007 0006007a:print_error_data resource=2dfe adr=c0012ec3f010 ofs=0010 01000310 8000 Oct 23 18:48:37 p5l8 kernel: PU0007 0006007a:print_error_data resource=2dfe adr=c0012ec3f020 ofs=0020 a005 Oct 23 18:48:37 p5l8 kernel: PU0007 0006007a:print_error_data resource=2dfe adr=c0012ec3f030 ofs=0030 0100 Oct 23 18:48:37 p5l8 kernel: PU0007 0006007a:print_error_data resource=2dfe adr=c0012ec3f040 ofs=0040 0001 Oct 23 18:48:37 p5l8 kernel: PU0007 0006007a:print_error_data resource=2dfe adr=c0012ec3f050 ofs=0050 0014 Oct 23 18:48:37 p5l8 kernel: PU0007 0006007a:print_error_data resource=2dfe adr=c0012ec3f060 ofs=0060 Oct 23 18:48:37 p5l8 kernel: PU0007 0006007a:print_error_data resource=2dfe adr=c0012ec3f070 ofs=0070 Oct 23 18:48:37 p5l8 kernel: PU0007 0006007a:print_error_data resource=2dfe adr=c0012ec3f080 ofs=0080 0080262b 00ff Oct 23 18:48:37 p5l8 kernel: PU0007 0006007a:print_error_data resource=2dfe adr=c0012ec3f090 ofs=0090 00ff 09f49900 Oct 23 18:48:37 p5l8 kernel: PU0007 0006007a:print_error_data resource=2dfe adr=c0012ec3f0a0 ofs=00a0 000e0492 000a Oct 23 18:48:37 p5l8 kernel: PU0007 0006007a:print_error_data resource=2dfe adr=c0012ec3f0b0 ofs=00b0 0001 002b Oct 23 18:48:37 p5l8 kernel: PU0007 0006007a:print_error_data resource=2dfe adr=c0012ec3f0c0 ofs=00c0 Oct 23 18:48:37 p5l8 kernel: PU0007 0006007a:print_error_data resource=2dfe adr=c0012ec3f0d0 ofs=00d0 Oct 23 18:48:37 p5l8 kernel: PU0007 0006007a:print_error_data resource=2dfe adr=c0012ec3f0e0 ofs=00e0 Oct 23 18:48:37 p5l8 kernel: PU0007 0006007a:print_error_data resource=2dfe adr=c0012ec3f0f0 ofs=00f0 0003 Oct 23 18:48:37 p5l8 kernel: PU0007 0006007a:print_error_data resource=2dfe adr=c0012ec3f100 ofs=0100 001a 0004 Oct 23 18:48:37 p5l8 kernel: PU0007 0006007a:print_error_data resource=2dfe adr=c0012ec3f110 ofs=0110 0004 0032 Oct 23 18:48:37 p5l8 kernel: PU0007 0006007a:print_error_data resource=2dfe adr=c0012ec3f120 ofs=0120 dc9d4600 03c32f28 Oct 23 18:48:37 p5l8 kernel: PU0007 0006007a:print_error_data resource=2dfe adr=c0012ec3f130 ofs=0130 0009f4aa 00
Re: [openib-general] ibv_reg_mr failure with pvfs on ehca?
Hoang-Nam Nguyen wrote: > Hi Troy! > >> The netpipe code is available with mercurial by: >> hg clone http://source.scl.ameslab.gov/hg/netpipe3-pvfs-dev >> Once you have pvfs2-1.5.1 installed, you should be able to do 'make >> pvfs' in the netpipe3-pvfs-dev directory and build NPpvfs. >> The command line arguments I used to reproduce this were: >> ./NPpvfs -d $PVFS_FILE_PATH -l 32768 -u 268435456 -n 100 -o >> $NETPIPE_OUTPUT_FILE >> > Thanks for this. I've been struggling with setting up the systems > to recreate this problem. Please be patient. > Can you please send me the ouput of modinfo ib_ehca (or hcad_mod > in older version)? Also the firmware code level as plained in > previous email. How many memory have you assigned to the partition? > With those data I'd be able to have nearly the same envs like yours. > >> This is the dmesg log: >> PU0001 000e0091:ehca_hcall_7arg_7ret HCAD_ERROR opcode=160 >> ret=fff7 arg1=1304 arg2=5 arg3=4000f83 >> arg4=1 arg5=e0 arg6=eb6b6920 arg7=0 out1=0 out2=0 >> out3=0 out4=0 out5=0 out6=0 out7=0 >> PU0001 00090454:ehca_reg_mr HCAD_ERROR hipz_alloc_mr failed, >> h_ret=fff7 hca_hndl=1304 >> PU0001 00090478:ehca_reg_mr <<< ret=ffea shca=c000e796b000 >> e_mr=c000ce865e80 iova_start=04000f83 size=1 acl=7 >> e_pd=c000eb6b6920 pginfo=c000dfcb3a70 num_pages=10 num_4k=10 >> PU0001 00090176:ehca_reg_user_mr <<< rc=ffea >> pd=c000eb6b6920 region=c000ce861dd0 mr_access_flags=7 >> udata=c000dfcb3ba0 >> > I got this already from you and Kyle. I meant the full log with > debug traces enabled: modprobe ib_ehca debug_level=1 or for older > versions modprobe hcad_mod debug_level=99. If > possible, try to get it. Anyway I'll do that with my test env. > Thanks! > Nam > > > I believe we have 8GB allocated on each this box(all memory and cpus allocated to one partition ), and we're running firmware version SF240_233. p5l5:~# modinfo hcad_mod filename: /lib/modules/2.6.17/kernel/drivers/infiniband/hw/ehca/hcad_mod.ko version:SVNEHCA_0009 description:IBM eServer HCA InfiniBand Device Driver author: Christoph Raisch <[EMAIL PROTECTED]> license:Dual BSD/GPL srcversion: 2B35F7963CEB9E6067F3F92 depends:ib_core vermagic: 2.6.17 SMP mod_unload gcc-4.0 parm: open_aqp1:AQP1 on startup (0: no (default), 1: yes) (int) parm: debug_level:debug level (0: node, 6: only errors (default), 9: all) (int) parm: hw_level:hardware level (0: autosensing (default), 1: v. 0.20, 2: v. 0.21) (int) parm: nr_ports:number of connected ports (default: 2) (int) parm: use_hp_mr:high performance MRs (0: no (default), 1: yes) (int) parm: port_act_time:time to wait for port activation (default: 30 sec) (int) parm: poll_all_eqs:polls all event queues periodically (0: no, 1: yes (default)) (int) parm: static_rate:set permanent static rate (default: disabled) (int) And, setting the debug_level flag definitely caused the server to not respond... I rebooted and tried it again, same thing, setting the debug_level flag causes the server to crash. (I can still login, but cannot execute anything, e.g. 'ls', it seems all the cpu's are spinning) p5l5:~# modprobe hcad_mod nr_ports=1 debug_level= console output after above command hangs server: PU0003 000e0252:hipz_h_register_rpage >>> adapter_handle=10020304 pagesize=0 queue_type=0 resource_handle=700100018600 logical_address_of_page=e6741000 count=200 PU0003 000e0078:ehca_hcall_7arg_7ret >>> opcode=1ac arg1=10020304 arg2=0 arg3=700100018600 arg4=e6741000 arg5=200 arg6=0 arg7=0 PU0003 000e0096:ehca_hcall_7arg_7ret <<< opcode=1ac ret=f out1=50 out2=50 out3=50 out4=50 out5=50 out6=50 out7=50 PU0003 000e0263:hipz_h_register_rpage <<< ret=f PU0003 000e04ad:hipz_h_register_rpage_mr <<< ret=f PU0003 0009076c:ehca_set_pagebuf >>> pginfo=c000eb7b75e0 type=1 num_pages=1d4000 num_4k=1d4000 next_buf=0 next_4k=30600 number=200 kpage=c000e6741000 page_cnt=30600 page_4k_cnt=30600 next_listelem=0 region= next_chunk= next_nmap=0 PU0003 00090807:ehca_set_pagebuf <<< ret=0 e_mr=c000e1ac2e80 pginfo=c000eb7b75e0 type=1 num_pages=1d4000 num_4k=1d4000 next_buf=0 next_4k=30800 number=200 kpage=c000e6742000 page_cnt=30800 page_4k_cnt=30800 i=200 next_listelem=0 region= next_chunk= next_nmap=0 PU0003 000e049e:hipz_h_register_rpage_mr >>> adapter_handle=10020304 mr=c000e1ac2e80 mr_handle=700100018600 pagesize=0 queue_type=0 logical_address_of_page=e6741000 count=200 PU0003 000e0252:hipz_h_register_rpage >>> adapter_handle=10020304 pagesize=0 queue_type=0 resource_handle=700100018600 logical_address_of_page=e6741000 count=200 PU0003 000e0078:ehca_hcall_
Re: [openib-general] ibv_reg_mr failure with pvfs on ehca?
Hello Troy! > The netpipe code is available with mercurial by: > hg clone http://source.scl.ameslab.gov/hg/netpipe3-pvfs-dev > Once you have pvfs2-1.5.1 installed, you should be able to do 'make > pvfs' in the netpipe3-pvfs-dev directory and build NPpvfs. > The command line arguments I used to reproduce this were: > ./NPpvfs -d $PVFS_FILE_PATH -l 32768 -u 268435456 -n 100 -o > $NETPIPE_OUTPUT_FILE Did you compile pvfs and NPpvfs as 32-bit or 64-bit libs/execs? I did compile pvfs and NPpvfs as is and realized that pvfs is built by default as 32-bit and NPpvfs as 64-bit. Hence NPpvfs complained to find incompatible pvfs libs. Regards Nam ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] ibv_reg_mr failure with pvfs on ehca?
Hi Troy! > The netpipe code is available with mercurial by: > hg clone http://source.scl.ameslab.gov/hg/netpipe3-pvfs-dev > Once you have pvfs2-1.5.1 installed, you should be able to do 'make > pvfs' in the netpipe3-pvfs-dev directory and build NPpvfs. > The command line arguments I used to reproduce this were: > ./NPpvfs -d $PVFS_FILE_PATH -l 32768 -u 268435456 -n 100 -o > $NETPIPE_OUTPUT_FILE Thanks for this. I've been struggling with setting up the systems to recreate this problem. Please be patient. Can you please send me the ouput of modinfo ib_ehca (or hcad_mod in older version)? Also the firmware code level as plained in previous email. How many memory have you assigned to the partition? With those data I'd be able to have nearly the same envs like yours. > This is the dmesg log: > PU0001 000e0091:ehca_hcall_7arg_7ret HCAD_ERROR opcode=160 > ret=fff7 arg1=1304 arg2=5 arg3=4000f83 > arg4=1 arg5=e0 arg6=eb6b6920 arg7=0 out1=0 out2=0 > out3=0 out4=0 out5=0 out6=0 out7=0 > PU0001 00090454:ehca_reg_mr HCAD_ERROR hipz_alloc_mr failed, > h_ret=fff7 hca_hndl=1304 > PU0001 00090478:ehca_reg_mr <<< ret=ffea shca=c000e796b000 > e_mr=c000ce865e80 iova_start=04000f83 size=1 acl=7 > e_pd=c000eb6b6920 pginfo=c000dfcb3a70 num_pages=10 num_4k=10 > PU0001 00090176:ehca_reg_user_mr <<< rc=ffea > pd=c000eb6b6920 region=c000ce861dd0 mr_access_flags=7 > udata=c000dfcb3ba0 I got this already from you and Kyle. I meant the full log with debug traces enabled: modprobe ib_ehca debug_level=1 or for older versions modprobe hcad_mod debug_level=99. If possible, try to get it. Anyway I'll do that with my test env. Thanks! Nam ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] ibv_reg_mr failure with pvfs on ehca?
>>> I'm not sure the standard OpenIB NetPIPE runs can reproduce this >>> type of workload. However, we have developed a working PVFS2- >>> NetPIPE module which can reproduce this problem on occassion, if >>> there is interest in further testing this on your end, I can make >>> it available. > Yes. Please send it to me. I'd like to test it. Is it a user space > appl.? > I want to see if we could reach the limit of mappings mentioned above. The netpipe code is available with mercurial by: hg clone http://source.scl.ameslab.gov/hg/netpipe3-pvfs-dev Once you have pvfs2-1.5.1 installed, you should be able to do 'make pvfs' in the netpipe3-pvfs-dev directory and build NPpvfs. The command line arguments I used to reproduce this were: ./NPpvfs -d $PVFS_FILE_PATH -l 32768 -u 268435456 -n 100 -o $NETPIPE_OUTPUT_FILE This is the dmesg log: PU0001 000e0091:ehca_hcall_7arg_7ret HCAD_ERROR opcode=160 ret=fff7 arg1=1304 arg2=5 arg3=4000f83 arg4=1 arg5=e0 arg6=eb6b6920 arg7=0 out1=0 out2=0 out3=0 out4=0 out5=0 out6=0 out7=0 PU0001 00090454:ehca_reg_mr HCAD_ERROR hipz_alloc_mr failed, h_ret=fff7 hca_hndl=1304 PU0001 00090478:ehca_reg_mr <<< ret=ffea shca=c000e796b000 e_mr=c000ce865e80 iova_start=04000f83 size=1 acl=7 e_pd=c000eb6b6920 pginfo=c000dfcb3a70 num_pages=10 num_4k=10 PU0001 00090176:ehca_reg_user_mr <<< rc=ffea pd=c000eb6b6920 region=c000ce861dd0 mr_access_flags=7 udata=c000dfcb3ba0 ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] ibv_reg_mr failure with pvfs on ehca?
Hello Troy and Kyle! > > Kyle wrote: > > Our app writes out a file once, then reads it in many times through > > the pvfs2 system. In the pvfs2 layers, there is memory caching > > done at the network level, so memory is registered by the app, and > > attempts are made to re-register and/or re-use these memory regions > > to save on memory reg overhead. The problem occurs only while > > writing files, so while memory is being initially registered with > > the nic/app and cached? Also, our tests show that the app runs > > normally to completion on identical machines using mellanox hca's > > instead of the eHCA. The file sizes are generally >16GByte, > > however our failures usually appear by the time ~220-250MBytes have > > been written(possibly also all registered)? We have tested memory registration with 64GB. So I don't think ~16GB is an issue. However we do have a restriction of mappings in that the total number of mappings is twice of the total number of pages assigned to the partition. The term mappings means the number of pages in the calls to ib_reg_phys_mr() or ib_reg_user_mr()/ibv_reg_mr(). ehca driver does register the whole space at module load time so that for user space applications you have a limit of mappings equal the total number of physical pages. Note that kernel modules sitting on top of ehca eg. ib_ipoib, ib_mad don't suffer under this limit since they share the whole space registered by ehca as they call ib_get_dma_mr(). > > I'm not sure the standard OpenIB NetPIPE runs can reproduce this > > type of workload. However, we have developed a working PVFS2- > > NetPIPE module which can reproduce this problem on occassion, if > > there is interest in further testing this on your end, I can make > > it available. Yes. Please send it to me. I'd like to test it. Is it a user space appl.? I want to see if we could reach the limit of mappings mentioned above. > > Our ehca's have the following revision info: > >vendor_id: 0x5076 > >vendor_part_id: 0 > >hw_ver: 0x103 > > Kernel version is debian 2.6.17 ok. For completeness please give me the driver version using modinfo and also the firmware code level via HMC - click on "Licensed Internal Code Maintenance" (left pane), "Change Licensed Internal Code" (right pane), select your frame and then "View System Info", "Display Current Values". You can also turn on the debug traces of ehca to track all reg_mr() calls in order to determine if you reach the limit of mappings mentioned above. Or just send me the whole dmesg resp. /var/log/messages. > Troy wrote: > What are the limits on the ehca memory registrations? > Is there a limit to the number of regions that can be registered? See above > Is > there any way (with kernel hacks) that we can register the entire > address space of the application? We would like to be able to do RDMA > sends and receives from anywhere in the application address space > eventually, and only register it once. ib_ipoib or ib_mad actually uses ib_get_dma_mr() and obtain the whole space. For user space there is no corresponding api yet. > What is the point of RDMA for memory-intensive applications if you > have to copy the data to a registered buffer before sending it anyway? Not sure if I understand completely... However RDMA mr accesses should not be an issue. Thanks! Nam ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] ibv_reg_mr failure with pvfs on ehca?
> (I am taking this back to the openib list because I think the list > needs to hear about real applications that are hitting memory > registration limits) > > What are the limits on the ehca memory registrations? > Is there a limit to the number of regions that can be registered? The numbe rof regions should not be limited, The total size of regions is limited, all user applications together can only register the complete available physical memory once. The rationale behind that is that you can give away physical memory only once to a application. Registering shared memory regions on a "physical" memory region should be unlimited as well. > Is > there any way (with kernel hacks) that we can register the entire > address space of the application? I'd guess you mean physical available memory space. Would be definetly hard to "pin" virtual memory provided by swapping. > We would like to be able to do RDMA > sends and receives from anywhere in the application address space > eventually, and only register it once. Yes, that's the fastest way to use IB. But keep in mind that registered memory is pinned and can't be given to "helper" tasks, like sshd. So you have to restrict you application to max memory minus the memory needed by base kernel+ daemons+bash+... to be able to "breathe". > > What is the point of RDMA for memory-intensive applications if you > have to copy the data to a registered buffer before sending it anyway? > Regards . . . Christoph Raisch ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] ibv_reg_mr failure with pvfs on ehca?
(I am taking this back to the openib list because I think the list needs to hear about real applications that are hitting memory registration limits) What are the limits on the ehca memory registrations? Is there a limit to the number of regions that can be registered? Is there any way (with kernel hacks) that we can register the entire address space of the application? We would like to be able to do RDMA sends and receives from anywhere in the application address space eventually, and only register it once. What is the point of RDMA for memory-intensive applications if you have to copy the data to a registered buffer before sending it anyway? On Oct 18, 2006, at 11:27 AM, Kyle Schochenmaier wrote: > Hoang-Nam Nguyen wrote: >> Hi Troy! >> >>> I am running PVFS2 on OpenIB, with IBM's ehca. >>> When we start writing/reading large files, either with the NetPIPE >>> PVFS module we have or a modified GAMESS executable that uses >>> libpvfs2 directly, the 'ibv_reg_mr' function fails, and we get an >>> error. >>> This is also correlated with kernel log messages like this: >>> Oct 16 11:14:45 p5l8 kernel: PU0003 000e0091:ehca_hcall_7arg_7ret >>> HCAD_ERROR opco >>> de=160 ret=fff7 arg1=1304 arg2=5 >>> arg3=14f0ebc8 arg4=1 >>> arg5=e0 arg6=e3e9f200 arg7=0 out1=0 out2=0 out3=0 out4=0 >>> out5=0 out6=0 >>> out7=0 >>> >> Return code f7 from firmware/hvcall means H_NO_MEM. I'm wondering >> if you could provide me with some pre-history of this problem. >> Is this a permanent problem? If yes, could you give me more infos >> on your testcase resp. scenario eg large file size, NetPIPE options? >> Which version of ehca are you using? And which kernel version? >> Thanks! >> Hoang-Nam Nguyen >> >> > I think Troy could better explain what is happening here, so I'm > taking this off-list for now -- we're trying to get this working > for SC'06, so time is limited :) -- if Troy wants to forward this > on to the list after looking at it, thats fine too. > Our app writes out a file once, then reads it in many times through > the pvfs2 system. In the pvfs2 layers, there is memory caching > done at the network level, so memory is registered by the app, and > attempts are made to re-register and/or re-use these memory regions > to save on memory reg overhead. The problem occurs only while > writing files, so while memory is being initially registered with > the nic/app and cached? Also, our tests show that the app runs > normally to completion on identical machines using mellanox hca's > instead of the eHCA. The file sizes are generally >16GByte, > however our failures usually appear by the time ~220-250MBytes have > been written(possibly also all registered)? > > I'm not sure the standard OpenIB NetPIPE runs can reproduce this > type of workload. However, we have developed a working PVFS2- > NetPIPE module which can reproduce this problem on occassion, if > there is interest in further testing this on your end, I can make > it available. > > Our ehca's have the following revision info: >vendor_id: 0x5076 >vendor_part_id: 0 >hw_ver: 0x103 > Kernel version is debian 2.6.17 > > I hope this is enough info to get some more insight from everyone. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] ibv_reg_mr failure with pvfs on ehca?
Hi Troy! > I am running PVFS2 on OpenIB, with IBM's ehca. > When we start writing/reading large files, either with the NetPIPE > PVFS module we have or a modified GAMESS executable that uses > libpvfs2 directly, the 'ibv_reg_mr' function fails, and we get an error. > This is also correlated with kernel log messages like this: > Oct 16 11:14:45 p5l8 kernel: PU0003 000e0091:ehca_hcall_7arg_7ret > HCAD_ERROR opco > de=160 ret=fff7 arg1=1304 arg2=5 > arg3=14f0ebc8 arg4=1 > arg5=e0 arg6=e3e9f200 arg7=0 out1=0 out2=0 out3=0 out4=0 > out5=0 out6=0 > out7=0 Return code f7 from firmware/hvcall means H_NO_MEM. I'm wondering if you could provide me with some pre-history of this problem. Is this a permanent problem? If yes, could you give me more infos on your testcase resp. scenario eg large file size, NetPIPE options? Which version of ehca are you using? And which kernel version? Thanks! Hoang-Nam Nguyen ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general