Re: [OMPI devel] MPI_Bcast using TIPC
Thank you all for replying; I think I know where to start now. I will post when I have more information :) /Xin

On 09/27/2011 05:49 PM, Christian Siebert wrote: Hi Xin, I think you are referring to the multicast functionality of TIPC, right? That would be great if it works properly. You might also want to compare with existing work that was done some years ago - see this announcement: http://www.open-mpi.org/community/lists/devel/2007/03/1358.php Good luck! Christian

Hi all, sorry that the paperwork for the TIPC BTL is taking so long. Hopefully I can get some feedback this week. In the meantime, I would like to do some work while waiting. As future work, I would like to add TIPC broadcast to support MPI_Bcast. But as I looked into the TCP BTL code, I could not find any code doing broadcast. Does that mean MPI_Bcast is actually done by sending one by one? In that case, if I want to add actual broadcast support, which parts should I change? I guess not just adding a BTL, but modifying the PML as well? Thank you in advance! /Xin
[OMPI devel] MPI_Bcast using TIPC
Hi all, sorry that the paperwork for the TIPC BTL is taking so long. Hopefully I can get some feedback this week. In the meantime, I would like to do some work while waiting. As future work, I would like to add TIPC broadcast to support MPI_Bcast. But as I looked into the TCP BTL code, I could not find any code doing broadcast. Does that mean MPI_Bcast is actually done by sending one by one? In that case, if I want to add actual broadcast support, which parts should I change? I guess not just adding a BTL, but modifying the PML as well? Thank you in advance! /Xin
Re: [OMPI devel] TIPC BTL code ready for review
--> 400.78 Mbps in  117.02 usec
 61:    8189 bytes  285 times --> 458.70 Mbps in  136.20 usec
 62:    8192 bytes  367 times --> 460.25 Mbps in  135.80 usec
 63:    8195 bytes  368 times --> 461.14 Mbps in  135.58 usec
 64:   12285 bytes  368 times --> 497.80 Mbps in  188.28 usec
 65:   12288 bytes  354 times --> 495.96 Mbps in  189.03 usec
 66:   12291 bytes  352 times --> 498.39 Mbps in  188.15 usec
 67:   16381 bytes  177 times --> 562.50 Mbps in  222.18 usec
 68:   16384 bytes  225 times --> 563.89 Mbps in  221.68 usec
 69:   16387 bytes  225 times --> 562.61 Mbps in  222.22 usec
 70:   24573 bytes  225 times --> 629.04 Mbps in  298.04 usec
 71:   24576 bytes  223 times --> 632.04 Mbps in  296.66 usec
 72:   24579 bytes  224 times --> 628.97 Mbps in  298.14 usec
 73:   32765 bytes  111 times --> 667.51 Mbps in  374.49 usec
 74:   32768 bytes  133 times --> 668.03 Mbps in  374.24 usec
 75:   32771 bytes  133 times --> 667.54 Mbps in  374.54 usec
 76:   49149 bytes  133 times --> 706.32 Mbps in  530.89 usec
 77:   49152 bytes  125 times --> 705.28 Mbps in  531.70 usec
 78:   49155 bytes  125 times --> 706.43 Mbps in  530.87 usec
 79:   65533 bytes   62 times --> 746.28 Mbps in  669.96 usec
 80:   65536 bytes   74 times --> 750.98 Mbps in  665.80 usec
 81:   65539 bytes   75 times --> 745.64 Mbps in  670.59 usec
 82:   98301 bytes   74 times --> 786.29 Mbps in  953.81 usec
 83:   98304 bytes   69 times --> 786.03 Mbps in  954.17 usec
 84:   98307 bytes   69 times --> 785.73 Mbps in  954.56 usec
 85:  131069 bytes   34 times --> 822.93 Mbps in 1215.15 usec
 86:  131072 bytes   41 times --> 825.56 Mbps in 1211.31 usec
 87:  131075 bytes   41 times --> 822.65 Mbps in 1215.61 usec
 88:  196605 bytes   41 times --> 847.04 Mbps in 1770.85 usec
 89:  196608 bytes   37 times --> 849.10 Mbps in 1766.57 usec
 90:  196611 bytes   37 times --> 846.81 Mbps in 1771.38 usec
 91:  262141 bytes   18 times --> 853.36 Mbps in 2343.64 usec
 92:  262144 bytes   21 times --> 853.44 Mbps in 2343.45 usec
 93:  262147 bytes   21 times --> 853.69 Mbps in 2342.81 usec
 94:  393213 bytes   21 times --> 865.59 Mbps in 3465.83 usec
 95:  393216 bytes   19 times --> 865.40 Mbps in 3466.61 usec
 96:  393219 bytes   19 times --> 865.48 Mbps in 3466.31 usec
 97:  524285 bytes    9 times --> 871.99 Mbps in 4587.17 usec
 98:  524288 bytes   10 times --> 871.85 Mbps in 4587.95 usec
 99:  524291 bytes   10 times --> 872.13 Mbps in 4586.50 usec
100:  786429 bytes   10 times --> 878.77 Mbps in 6827.70 usec
101:  786432 bytes    9 times --> 879.14 Mbps in 6824.83 usec
102:  786435 bytes    9 times --> 878.82 Mbps in 6827.39 usec
103: 1048573 bytes    4 times --> 884.29 Mbps in 9046.74 usec
104: 1048576 bytes    5 times --> 884.41 Mbps in 9045.60 usec
105: 1048579 bytes    5 times --> 884.15 Mbps in 9048.29 usec
106: 1572861 bytes    5 times --> 887.90 Mbps in 13514.99 usec
107: 1572864 bytes    4 times --> 887.90 Mbps in 13515.01 usec
108: 1572867 bytes    4 times --> 887.81 Mbps in 13516.38 usec
109: 2097149 bytes    3 times --> 889.80 Mbps in 17981.51 usec
110: 2097152 bytes    3 times --> 889.91 Mbps in 17979.33 usec
111: 2097155 bytes    3 times --> 889.90 Mbps in 17979.65 usec
112: 3145725 bytes    3 times --> 892.55 Mbps in 26889.17 usec
113: 3145728 bytes    3 times --> 892.60 Mbps in 26887.83 usec
114: 3145731 bytes    3 times --> 892.57 Mbps in 26888.68 usec
115: 4194301 bytes    3 times --> 893.98 Mbps in 35795.15 usec
116: 4194304 bytes    3 times --> 893.95 Mbps in 35796.01 usec
117: 4194307 bytes    3 times --> 893.94 Mbps in 35796.66 usec
118: 6291453 bytes    3 times --> 895.36 Mbps in 53609.49 usec
119: 6291456 bytes    3 times --> 895.36 Mbps in 53609.49 usec
120: 6291459 bytes    3 times --> 895.39 Mbps in 53608.00 usec
121: 8388605 bytes    3 times --> 895.86 Mbps in 71439.65 usec
122: 8388608 bytes    3 times --> 895.87 Mbps in 71438.84 usec
123: 8388611 bytes    3 times --> 895.80 Mbps in 71444.32 usec

On 09/01/2011 04:08 PM, Jeff Squyres wrote: On Sep 1, 2011, at 7:05 AM, Xin He wrote: And get the result as in the appendix. It seems that TCP performs better with smaller messages, while TIPC does better with larger messages. Interesting. Any idea why? From the TIPC paper you sent, one of TIPC's strengths was that it was supposed to be faster than TCP for small messages.
Do you know what the raw performance numbers are for TCP and TIPC on this machine without MPI?
Re: [OMPI devel] TIPC BTL code ready for review
Hi, I found the reason: besides the direct links between the 2 PCs, there is another link going through many switches, and the TCP BTL seems to use this slower link. So I ran again with eth0 only.

I built ompi with: ./configure --disable-mpi-f90 --disable-mpi-f77 --disable-mpi-cxx --disable-vt --disable-io-romio --prefix=/usr --with-platform=optimized

And ran with: mpirun -n 6 --mca btl tcp,self --mca btl_tcp_if_include eth0 -hostfile my_hostfile --bynode ./IMB-MPI1 > tcp_0901

And got the result as in the appendix. It seems that TCP performs better with smaller messages, while TIPC does better with larger messages. /Xin

On 08/30/2011 05:50 PM, Jeff Squyres wrote: On Aug 29, 2011, at 3:51 AM, Xin He wrote:

$ mpirun --mca btl tcp,self --bynode -np 2 --mca btl_tcp_if_include eth0 hostname
svbu-mpi008
svbu-mpi009
$ mpirun --mca btl tcp,self --bynode -np 2 --mca btl_tcp_if_include eth0 IMB-MPI1 PingPong
#---
# Intel (R) MPI Benchmark Suite V3.2, MPI-1 part
#---

Hi, I think these models are reasonably new :) The results I gave you were tested with 2 processes, but on 2 different servers. I take it the result you showed is 2 processes on one machine? Nope -- check my output -- I'm running across 2 different servers and through a 1Gb TOR ethernet switch (it's not a particularly high-performance ethernet switch, either). Can you run some native NetPIPE TCP numbers across the same nodes that you ran the TIPC MPI tests over? You should be getting lower latency than what you're seeing. Do you have jumbo frames enabled, perchance? Are you going through only 1 switch? If you're on a NUMA server, do you have processor affinity enabled, and are the processes located "near" the NIC? BTW, I forgot to tell you about SM & TIPC. Unfortunately, TIPC does not beat SM... That's probably not surprising; SM is tuned pretty well specifically for MPI communication across shared memory.
#---
# Intel (R) MPI Benchmark Suite V3.2, MPI-1 part
#---
# Date                  : Thu Sep 1 10:42:40 2011
# Machine               : i686
# System                : Linux
# Release               : 2.6.32-24-generic-pae
# Version               : #39-Ubuntu SMP Wed Jul 28 07:39:26 UTC 2010
# MPI Version           : 2.1
# MPI Thread Environment: MPI_THREAD_SINGLE

# New default behavior from Version 3.2 on:
# the number of iterations per message size is cut down
# dynamically when a certain run time (per message size sample)
# is expected to be exceeded. Time limit is defined by variable
# "SECS_PER_SAMPLE" (=> IMB_settings.h)
# or through the flag => -time

# Calling sequence was:
# ./IMB-MPI1

# Minimum message length in bytes: 0
# Maximum message length in bytes: 4194304
#
# MPI_Datatype                   : MPI_BYTE
# MPI_Datatype for reductions    : MPI_FLOAT
# MPI_Op                         : MPI_SUM
#
# List of Benchmarks to run:
# PingPong
# PingPing
# Sendrecv
# Exchange
# Allreduce
# Reduce
# Reduce_scatter
# Allgather
# Allgatherv
# Gather
# Gatherv
# Scatter
# Scatterv
# Alltoall
# Alltoallv
# Bcast
# Barrier

#---
# Benchmarking PingPong
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#---
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000        51.32         0.00
            1         1000        51.80         0.02
            2         1000        51.75         0.04
            4         1000        51.64         0.07
            8         1000        51.87         0.15
           16         1000        51.62         0.30
           32         1000        52.14         0.59
           64         1000        51.88         1.18
          128         1000        52.81         2.31
          256         1000        54.87         4.45
          512         1000        57.65         8.47
         1024         1000        74.70        13.07
         2048         1000        90.91        21.49
         4096         1000       115.36        33.86
         8192         1000       147.96        52.80
        16384         1000       228.96        68.24
        32768         1000       390.84        79.96
        65536          640       789.71        79.14
       131072          320      1349.10        92.65
       262144          160      2479.60       100.82
       524288           80      4722.49       105.88
      1048576           40      9181.69       108.91
      2097152           20     18110.10       110.44
      4194304           10     35916.29       111.37

#---
# Benchmarking PingPing
# #processes = 2
# ( 4 additional
Re: [OMPI devel] TIPC BTL code ready for review
Yes, it is Gigabit Ethernet. I configured ompi again using "./configure --disable-mpi-f90 --disable-mpi-f77 --disable-mpi-cxx --disable-vt --disable-io-romio --prefix=/usr --with-platform=optimized" and ran IMB-MPI1 again with "mpirun --mca btl tcp,self -n 2 --hostfile my_hostfile --bynode ./IMB-MPI1"; the result does not seem very different though...

About TIPC, maybe this article can explain more: http://www.kernel.org/doc/ols/2004/ols2004v2-pages-61-70.pdf

To use TIPC, you configure it with "tipcutil". I first connected the 2 machines directly with wires, then:
1. Set the TIPC addresses of the 2 PCs, say <1.1.1> and <1.1.2> respectively, with the same network ID.
2. Run "tipc-config -v -i -be=eth:eth0,eth:eth1" on each machine to set the bearers. Check http://tipc.sourceforge.net/doc/tipc_1.7_users_guide.html#installation for more information.
3. Run "tipc-config -l" to check the links. If successful, you should see:
ehhexxn@node2:~/git/test_ompi/IMB_3.2/src$ tipc-config -l
Links:
multicast-link: up
1.1.2:eth0-1.1.1:eth0: up
1.1.2:eth1-1.1.1:eth1: up

In the attachment, there are sample programs using TIPC that can be used to test the TIPC environment :) /Xin

On 08/29/2011 03:22 PM, teng ma wrote: Is your interconnect Gigabit Ethernet? It's very surprising to see the TCP BTL get just a 33 MBytes peak BW on your cluster. I did a similar test on an AMD cluster with Gigabit Ethernet. As the following shows, the TCP BTL's BW there is similar to your tipc (112 MBytes/s). Could you redo the test with 2 processes spawned, 2 nodes in your machinefile, and --bynode enabled? It looks like your tipc BTL is pretty good at message sizes between 8K and 512K. Can you tell us more about the differences between the TIPC and TCP protocol stacks? Is any special configuration needed to enable your tipc? Maybe you can write a module for NetPIPE (similar to NPTcp) to test raw performance of both TCP and TIPC without MPI.
TCP BTL on Gigabit Ethernet:

#---
# Benchmarking PingPong
# #processes = 2
#---
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000        23.27         0.00
            1         1000        23.78         0.04
            2         1000        23.77         0.08
            4         1000        25.47         0.15
            8         1000        23.94         0.32
           16         1000        24.36         0.63
           32         1000        24.83         1.23
           64         1000        25.76         2.37
          128         1000        27.25         4.48
          256         1000        30.66         7.96
          512         1000        36.86        13.25
         1024         1000        49.00        19.93
         2048         1000        77.83        25.10
         4096         1000        82.42        47.39
         8192         1000       165.28        47.27
        16384         1000       325.01        48.08
        32768         1000       440.75        70.90
        65536          640      1060.00        58.96
       131072          320      1674.71        74.64
       262144          160      2814.13        88.84
       524288           80      4975.11       100.50
      1048576           40      9526.94       104.97
      2097152           20     18419.33       108.58
      4194304           10     36150.05       110.65
      8388608            5     71880.79       111.30

Teng

On Mon, Aug 29, 2011 at 3:51 AM, Xin He <xin.i...@ericsson.com <mailto:xin.i...@ericsson.com>> wrote: On 08/25/2011 03:14 PM, Jeff Squyres wrote: On Aug 25, 2011, at 8:25 AM, Xin He wrote: Can you edit your configure.m4 directly and test it and whatnot? I provided the configure.m4 as a starting point for you. :-) It shouldn't be hard to make it check linux/tipc.h instead of tipc.h. I'm happy to give you direct write access to the bitbucket, if you want it. I think me having write access is convenient for both of us :) Sure -- what's your bitbucket account ID? It's "letter113" As we've discussed off-list, we can't take the code upstream until the contributor agreement is signed, unfortunately. The agreement thing is ongoing right now, but it may take some time. No worries. Lawyers tend to take time when reviewing this stuff; we've seen this pattern in most organizations who sign the OMPI agreement. But to save time, can you guys do some tests on the TIPC BTL, so that when the agreement is ready, the code can be used? I don't know if any of us has the TIPC support libraries installed. It is easy to have TIPC support. I
Re: [OMPI devel] TIPC BTL code ready for review
On 08/25/2011 03:14 PM, Jeff Squyres wrote: On Aug 25, 2011, at 8:25 AM, Xin He wrote: Can you edit your configure.m4 directly and test it and whatnot? I provided the configure.m4 as a starting point for you. :-) It shouldn't be hard to make it check linux/tipc.h instead of tipc.h. I'm happy to give you direct write access to the bitbucket, if you want it. I think me having write access is convenient for both of us :) Sure -- what's your bitbucket account ID? It's "letter113" As we've discussed off-list, we can't take the code upstream until the contributor agreement is signed, unfortunately. The agreement thing is ongoing right now, but it may take some time. No worries. Lawyers tend to take time when reviewing this stuff; we've seen this pattern in most organizations who sign the OMPI agreement. But to save time, can you guys do some test on TIPC BTL, so that when the agreement is ready, the code can be used? I don't know if any of us has the TIPC support libraries installed. It is easy to have TIPC support. It is within the kernel actually. To get TIPC working, you only have to configure it by using "tipc-config". Maybe you can check this doc for information: http://tipc.sourceforge.net/doc/Users_Guide.txt So... what *is* TIPC? Is there a writeup anywhere that we can read about what it is / how it works? For example, what makes TIPC perform better than TCP? Sure. Search "TIPC: Providing Communication for Linux Clusters". It is a paper written by the author of TIPC, explaining basic stuff about TIPC, should be very useful. And you can visit TIPC homepage: http://tipc.sourceforge.net/ . I have done some tests using tools like NetPIPE, osu and IMB and the result shows that TIPC BTL has a better performance than TCP BTL. Great! Can you share any results? Yes, please check the appendix for the results using IMB 3.2. I have done the tests on 2 computers. 
Dell SC1435, Dual-Core AMD Opteron(tm) Processor 2212 HE x 2, 4 GB Mem, Ubuntu Server 10.04 LTS 32-bit, Linux 2.6.32-24

I'm not familiar with the Dell or Opteron lines -- how recent are those models? I ask because your TCP latency is a bit high (about 85us in the 2-process IMB PingPong); it might suggest older hardware. Or you may have built a debugging version of Open MPI (if you have a .svn or .hg checkout, that's the default). See the HACKING top-level file for how to get an optimized build. For example, with my debug build of Open MPI on fairly old Xeons with 1Gb ethernet, I'm getting the following PingPong results (note: this is a debug build; it's not even an optimized build):

$ mpirun --mca btl tcp,self --bynode -np 2 --mca btl_tcp_if_include eth0 hostname
svbu-mpi008
svbu-mpi009
$ mpirun --mca btl tcp,self --bynode -np 2 --mca btl_tcp_if_include eth0 IMB-MPI1 PingPong
#---
# Intel (R) MPI Benchmark Suite V3.2, MPI-1 part
#---
...
#---
# Benchmarking PingPong
# #processes = 2
#---
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000        57.31         0.00
            1         1000        57.71         0.02
            2         1000        57.73         0.03
            4         1000        57.81         0.07
            8         1000        57.78         0.13

With an optimized build, it shaves off only a few us (which isn't too important in this case, but it does matter in the low-latency transport cases):

#---
# Benchmarking PingPong
# #processes = 2
#---
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000        54.62         0.00
            1         1000        54.92         0.02
            2         1000        55.15         0.03
            4         1000        55.16         0.07
            8         1000        55.15         0.14

Hi, I think these models are reasonably new :) The results I gave you were tested with 2 processes, but on 2 different servers. I take it the result you showed is 2 processes on one machine? But I did build with debug enabled; I will try optimized then :) BTW, I forgot to tell you about SM & TIPC. Unfortunately, TIPC does not beat SM... /Xin
Re: [OMPI devel] TIPC BTL code ready for review
On 08/23/2011 04:35 PM, Jeff Squyres wrote: On Aug 23, 2011, at 9:54 AM, Xin He wrote: Hi, I modified the code, copyright comments added. I added your fixes to https://bitbucket.org/jsquyres/ompi-tipc. And about configure.m4, sorry I was not clear before, tipc.h is under /usr/include/linux/tipc.h, not under include directly. Can you edit your configure.m4 directly and test it and whatnot? I provided the configure.m4 as a starting point for you. :-) It shouldn't be hard to make it check linux/tipc.h instead of tipc.h. I'm happy to give you direct write access to the bitbucket, if you want it. I think me having write access is convenient for both of us :) As we've discussed off-list, we can't take the code upstream until the contributor agreement is signed, unfortunately. The agreement thing is ongoing right now, but it may take some time. But to save time, can you guys do some test on TIPC BTL, so that when the agreement is ready, the code can be used? I have done some tests using tools like NetPIPE, osu and IMB and the result shows that TIPC BTL has a better performance than TCP BTL. Great! Can you share any results? Yes, please check the appendix for the results using IMB 3.2. I have done the tests on 2 computers. Dell SC1435 Dual-Core AMD Opteron(tm) Processor 2212 HE x 2 4 GB Mem Ubuntu Server 10.04 LTS 32-bit Linux 2.6.32-24 Have you been able to compare it to the sm BTL? imb_result.tar Description: Unix tar archive
Re: [OMPI devel] TIPC BTL code ready for review
Hi, I modified the code; copyright comments added. And about configure.m4, sorry I was not clear before: tipc.h is under /usr/include/linux/tipc.h, not directly under include. I have done some tests using tools like NetPIPE, OSU and IMB, and the results show that the TIPC BTL has better performance than the TCP BTL. /Xin

On 08/17/2011 04:23 PM, Jeff Squyres wrote: Ok. For the moment, you might want to leave the priority alone and see how it goes. You can always manually turn off the SM BTL to test performance with and without it. If it turns out to be better than the SM BTL, we can play the priority tricks.

On Aug 17, 2011, at 10:09 AM, Xin He wrote: No, there is no library that must be linked to. :-) About the performance compared to SM, I have not tested that yet. So far, I have compared it with TCP. It has better performance under some circumstances, not all. Now I am working with profiling tools, hoping to find the reasons and improve it. /Xin

On 08/17/2011 04:04 PM, Jeff Squyres wrote: BTW, is there a libtipc that must be linked against? If so, can you give me a symbol name to check for in there?

On Aug 17, 2011, at 9:53 AM, Jeff Squyres wrote: I put it here: https://bitbucket.org/jsquyres/ompi-tipc/overview You can clone that repo with the Mercurial distributed version control tool. I'll add a configure.m4 shortly; possibly today. You can test it for me. :-) For the SM stuff, perhaps TIPC should just have a higher priority than the SM BTL -- that would naturally rank it above SM. Is TIPC's same-node performance better than SM's?

On Aug 17, 2011, at 9:36 AM, Xin He wrote: It is a single component. And could someone write a configure file for me? The structure sockaddr_tipc (defined in tipc.h) is a good sign we have tipc. Also, TIPC cannot be used with the SM component, because TIPC uses shared memory as well for communication between processes on the same node. Please kindly check the appendix. Thank you.
/Xin On 08/17/2011 03:15 PM, Jeff Squyres wrote: Is your code self-contained in a single component? If it's a small (compressed) tarball, just send it to the list. Otherwise, you might want to post it somewhere like bitbucket.org where people can download and look at it. On Aug 17, 2011, at 4:00 AM, Xin He wrote: Hi developers, I have ran TIPC BTL component with the tools that recommended. After fixing some major bugs, I think the code is ready to be reviewed. I understand that a form has to be signed before OMPI can accept code. My organization is preparing that and soon a form will be sent. But in the meantime, can someone review my code please? Where should I send to? Thank you. Best regards, Xin

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel

-- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] TIPC BTL code ready for review
No, there is no library that must be linked to. :-) About the performance compared to SM, I have not tested that yet. So far, I have compared it with TCP. It has better performance under some circumstances, not all. Now I am working with profiling tools, hoping to find the reasons and improve it. /Xin

On 08/17/2011 04:04 PM, Jeff Squyres wrote: BTW, is there a libtipc that must be linked against? If so, can you give me a symbol name to check for in there?

On Aug 17, 2011, at 9:53 AM, Jeff Squyres wrote: I put it here: https://bitbucket.org/jsquyres/ompi-tipc/overview You can clone that repo with the Mercurial distributed version control tool. I'll add a configure.m4 shortly; possibly today. You can test it for me. :-) For the SM stuff, perhaps TIPC should just have a higher priority than the SM BTL -- that would naturally rank it above SM. Is TIPC's same-node performance better than SM's?

On Aug 17, 2011, at 9:36 AM, Xin He wrote: It is a single component. And could someone write a configure file for me? The structure sockaddr_tipc (defined in tipc.h) is a good sign we have tipc. Also, TIPC cannot be used with the SM component, because TIPC uses shared memory as well for communication between processes on the same node. Please kindly check the appendix. Thank you. /Xin

On 08/17/2011 03:15 PM, Jeff Squyres wrote: Is your code self-contained in a single component? If it's a small (compressed) tarball, just send it to the list. Otherwise, you might want to post it somewhere like bitbucket.org where people can download and look at it.

On Aug 17, 2011, at 4:00 AM, Xin He wrote: Hi developers, I have run the TIPC BTL component with the tools that were recommended. After fixing some major bugs, I think the code is ready to be reviewed. I understand that a form has to be signed before OMPI can accept code. My organization is preparing that and soon a form will be sent. But in the meantime, can someone review my code please? Where should I send it? Thank you.
Best regards, Xin
Re: [OMPI devel] TIPC BTL code ready for review
It is a single component. And could someone write a configure file for me? The structure sockaddr_tipc (defined in tipc.h) is a good sign we have tipc. Also, TIPC cannot be used with the SM component, because TIPC uses shared memory as well for communication between processes on the same node. Please kindly check the appendix. Thank you. /Xin

On 08/17/2011 03:15 PM, Jeff Squyres wrote: Is your code self-contained in a single component? If it's a small (compressed) tarball, just send it to the list. Otherwise, you might want to post it somewhere like bitbucket.org where people can download and look at it.

On Aug 17, 2011, at 4:00 AM, Xin He wrote: Hi developers, I have run the TIPC BTL component with the tools that were recommended. After fixing some major bugs, I think the code is ready to be reviewed. I understand that a form has to be signed before OMPI can accept code. My organization is preparing that and soon a form will be sent. But in the meantime, can someone review my code please? Where should I send it? Thank you. Best regards, Xin
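Since the header lives at /usr/include/linux/tipc.h, the configure probe discussed in this thread only needs to test linux/tipc.h rather than tipc.h. A hedged sketch of what such a component check could look like (the macro and variable names here are illustrative, not the actual configure.m4 from the bitbucket repo):

```m4
# Hypothetical btl_tipc configure.m4 fragment: build the component only
# when the kernel header linux/tipc.h is present.
AC_DEFUN([MCA_btl_tipc_CONFIG],[
    AC_CHECK_HEADER([linux/tipc.h],
                    [btl_tipc_happy="yes"],
                    [btl_tipc_happy="no"])

    # $1 = action-if-found, $2 = action-if-not-found
    AS_IF([test "$btl_tipc_happy" = "yes"], [$1], [$2])
])
```

Checking for the sockaddr_tipc structure mentioned above could be done the same way with AC_CHECK_TYPES, passing [#include <linux/tipc.h>] as the includes argument.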
[OMPI devel] TIPC BTL code ready for review
Hi developers, I have run the TIPC BTL component with the tools that were recommended. After fixing some major bugs, I think the code is ready to be reviewed. I understand that a form has to be signed before OMPI can accept code. My organization is preparing that and a form will be sent soon. But in the meantime, can someone review my code please? Where should I send it? Thank you. Best regards, Xin
Re: [OMPI devel] [TIPC BTL] test programmes
Thank you all for replying and giving me useful suggestions. I know where to start now. :-) /Xin

On 08/02/2011 12:03 AM, Eugene Loh wrote: NAS Parallel Benchmarks are self-verifying. Another option is the MPI Testing Tool http://www.open-mpi.org/projects/mtt/ but it might be more trouble than it's worth. (INCIDENTALLY, THERE ARE TRAC TROUBLES WITH THE THREE LINKS AT THE BOTTOM OF THAT PAGE! COULD SOMEONE TAKE A LOOK?) If you do decide to explore MTT, http://www.open-mpi.org/projects/mtt/svn.php tells you how to do a Subversion checkout. It's a test harness. For the tests themselves, look in mtt/trunk/samples/*-template.ini for examples of what tests to run. Whether you want to pursue this route depends on whether you're serious about doing lots of testing.

On 08/01/11 17:13, Jeff Squyres wrote: Additionally, you might want to download and run a bunch of common MPI benchmarks, such as:
- NetPIPE
- Intel MPI Benchmarks (IMB)
- SKaMPI
- HPL (Linpack)
- ...etc.

On Aug 1, 2011, at 8:12 AM, Chris Samuel wrote: On Mon, 1 Aug 2011 09:47:00 PM Xin He wrote: Do any of you guys have any testing programs that I should run to test if it really works? How about a real MPI program which has test data to check it's running OK? Gromacs is open source and has a self-test mechanism run via "make test" IIRC. I think HPL (Linpack) also checks the data from its run.
[OMPI devel] [TIPC BTL] test programmes
Hi all, I have finished the development of TIPC BTL component. It can pass all sample programs that Open MPI has within the package. Do any of you guys have any testing programs that I should run to test if it really works? Thank you. Best regards, Xin
Re: [OMPI devel] TIPC BTL Segmentation fault
ame (stacktrace.c:348)
==30850==    by 0x5DB1B3F: ??? (in /lib/libpthread-2.12.1.so)
==30850==    by 0xDEAFBEEDDEAFBEEC: ???
==30850==    by 0x50151F1: opal_list_construct (opal_list.c:88)
==30850==    by 0xA8A49F1: opal_obj_run_constructors (opal_object.h:427)
==30850==    by 0xA8A4E59: mca_pml_ob1_comm_construct (pml_ob1_comm.c:56)
==30850==    by 0xA8A1385: opal_obj_run_constructors (opal_object.h:427)
==30850==    by 0xA8A149F: opal_obj_new (opal_object.h:477)
==30850==  Address 0xdeafbeeddeafbeed is not stack'd, malloc'd or (recently) free'd
==30850==
==30850==
==30850== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==30850==  General Protection Fault
==30850==    at 0xA011FDB: ??? (in /lib/libgcc_s.so.1)
==30850==    by 0xA012B0B: _Unwind_Backtrace (in /lib/libgcc_s.so.1)
==30850==    by 0x60BE69D: backtrace (backtrace.c:91)
==30850==    by 0x4FAB055: opal_backtrace_buffer (backtrace_execinfo.c:54)
==30850==    by 0x5026DF3: show_stackframe (stacktrace.c:348)
==30850==    by 0x5DB1B3F: ??? (in /lib/libpthread-2.12.1.so)
==30850==    by 0xDEAFBEEDDEAFBEEC: ???
==30850==    by 0x50151F1: opal_list_construct (opal_list.c:88)
==30850==    by 0xA8A49F1: opal_obj_run_constructors (opal_object.h:427)
==30850==    by 0xA8A4E59: mca_pml_ob1_comm_construct (pml_ob1_comm.c:56)
==30850==    by 0xA8A1385: opal_obj_run_constructors (opal_object.h:427)
==30850==    by 0xA8A149F: opal_obj_new (opal_object.h:477)
==30849== LEAK SUMMARY:
==30849==    definitely lost: 453 bytes in 13 blocks
==30849==    indirectly lost: 7,440 bytes in 12 blocks
==30849==      possibly lost: 0 bytes in 0 blocks
==30849==    still reachable: 2,331,071 bytes in 3,188 blocks
==30849==         suppressed: 0 bytes in 0 blocks
==30849== Rerun with --leak-check=full to see details of leaked memory
==30849==
==30849== For counts of detected and suppressed errors, rerun with: -v
==30849== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 4 from 4)

On 07/04/2011 01:51 PM, Jeff Squyres wrote: Keep in mind, too, that opal_object is the "base" object -- put in C++ terms, it's the abstract class that all other classes are made of. So it's rare that we would create an opal_object by itself. opal_objects are usually created as part of some other, higher-level object. What's the full call stack of where Valgrind is showing the error? Make sure you have the most recent valgrind (www.valgrind.org); the versions that ship in various distros may be somewhat old. Newer valgrind versions show lots of things that older versions don't. A new valgrind *might* be able to show some prior memory fault that is causing the issue...?

On Jul 4, 2011, at 7:45 AM, Xin He wrote: Hi, I ran the program with valgrind, and it showed almost the same error. It appeared that the segmentation fault happened during the initialization of an opal_object. That's why it puzzled me. /Xin

On 07/04/2011 01:40 PM, Jeff Squyres wrote: Ah -- so this is in the template code. I suspect this code might have bit rotted a bit. :-\ If you run this through valgrind, does anything obvious show up?
I ask because this kind of error is typically a symptom of the real error. I.e., the real error was some kind of memory corruption that occurred earlier, and this is the memory access that exposes that prior memory corruption. On Jul 4, 2011, at 5:08 AM, Xin He wrote: Yes, it is an opal_object. And this error seems to be caused by this code: void mca_btl_template_proc_construct(mca_btl_template_proc_t* template_proc) { ... /* add to list of all proc instances */ OPAL_THREAD_LOCK(&mca_btl_template_component.template_lock); opal_list_append(&mca_btl_template_component.template_procs, &template_proc->super); OPAL_THREAD_UNLOCK(&mca_btl_template_component.template_lock); } /Xin On 07/02/2011 10:49 PM, Jeff Squyres (jsquyres) wrote: Do you know which object it is that is being constructed? When you compile with debugging enabled, there are strings in the object struct that identify the file and line where the obj was created. Sent from my phone. No type good. On Jun 29, 2011, at 8:48 AM, "Xin He" <xin.i...@ericsson.com> wrote: Hi, As I advanced in my implementation of the TIPC BTL, I added the component and tried to run the hello_c program to test it. Then I got this segmentation fault. It seemed to happen after the call "mca_btl_tipc_add_procs". The error message displayed: [oak:23192] *** Process received signal *** [oak:23192] Signal: Segmentation fault (11) [oak:23192] Signal code: (128) [oak:23192] Failing at address: (nil) [oak:23192] [ 0] /lib/libpthread.so.0(+0xfb40) [0x7fec2a40fb40] [oak:23192] [ 1] /usr/lib/libmpi.so.0(+0x1e6c10) [0x7fec2b2afc10] [oak:23192] [ 2] /usr/lib/libmpi.so.0(+0x1e71f2) [0x7fec2b2b01f2] [oak:23192] [ 3] /usr/lib/openmpi/mca_pml_ob1.so(+0x59f2) [0x7fec264fc9f2] [oak:23192] [ 4] /usr/lib/openmpi/mca_pml_ob1.so(+0x5e5a) [0x7fec264fce5a] [oak:23192] [
Re: [OMPI devel] TIPC BTL Segmentation fault
Yes, it is an opal_object. And this error seems to be caused by this code: void mca_btl_template_proc_construct(mca_btl_template_proc_t* template_proc) { ... /* add to list of all proc instances */ OPAL_THREAD_LOCK(&mca_btl_template_component.template_lock); opal_list_append(&mca_btl_template_component.template_procs, &template_proc->super); OPAL_THREAD_UNLOCK(&mca_btl_template_component.template_lock); } /Xin On 07/02/2011 10:49 PM, Jeff Squyres (jsquyres) wrote: Do you know which object it is that is being constructed? When you compile with debugging enabled, there are strings in the object struct that identify the file and line where the obj was created. Sent from my phone. No type good. On Jun 29, 2011, at 8:48 AM, "Xin He" <xin.i...@ericsson.com> wrote: Hi, As I advanced in my implementation of the TIPC BTL, I added the component and tried to run the hello_c program to test it. Then I got this segmentation fault. It seemed to happen after the call "mca_btl_tipc_add_procs". The error message displayed: [oak:23192] *** Process received signal *** [oak:23192] Signal: Segmentation fault (11) [oak:23192] Signal code: (128) [oak:23192] Failing at address: (nil) [oak:23192] [ 0] /lib/libpthread.so.0(+0xfb40) [0x7fec2a40fb40] [oak:23192] [ 1] /usr/lib/libmpi.so.0(+0x1e6c10) [0x7fec2b2afc10] [oak:23192] [ 2] /usr/lib/libmpi.so.0(+0x1e71f2) [0x7fec2b2b01f2] [oak:23192] [ 3] /usr/lib/openmpi/mca_pml_ob1.so(+0x59f2) [0x7fec264fc9f2] [oak:23192] [ 4] /usr/lib/openmpi/mca_pml_ob1.so(+0x5e5a) [0x7fec264fce5a] [oak:23192] [ 5] /usr/lib/openmpi/mca_pml_ob1.so(+0x2386) [0x7fec264f9386] [oak:23192] [ 6] /usr/lib/openmpi/mca_pml_ob1.so(+0x24a0) [0x7fec264f94a0] [oak:23192] [ 7] /usr/lib/openmpi/mca_pml_ob1.so(+0x22fb) [0x7fec264f92fb] [oak:23192] [ 8] /usr/lib/openmpi/mca_pml_ob1.so(+0x3a60) [0x7fec264faa60] [oak:23192] [ 9] /usr/lib/libmpi.so.0(+0x67f51) [0x7fec2b130f51] [oak:23192] [10] /usr/lib/libmpi.so.0(MPI_Init+0x173) [0x7fec2b161c33] [oak:23192] [11] hello_i(main+0x22) [0x400936] [oak:23192] [12]
/lib/libc.so.6(__libc_start_main+0xfe) [0x7fec2a09bd8e] [oak:23192] [13] hello_i() [0x400859] [oak:23192] *** End of error message *** I used gdb to check the stack: (gdb) bt #0 0x77afac10 in opal_obj_run_constructors (object=0x6ca980) at ../opal/class/opal_object.h:427 #1 0x77afb1f2 in opal_list_construct (list=0x6ca958) at class/opal_list.c:88 #2 0x72d479f2 in opal_obj_run_constructors (object=0x6ca958) at ../../../../opal/class/opal_object.h:427 #3 0x72d47e5a in mca_pml_ob1_comm_construct (comm=0x6ca8c0) at pml_ob1_comm.c:55 #4 0x72d44386 in opal_obj_run_constructors (object=0x6ca8c0) at ../../../../opal/class/opal_object.h:427 #5 0x72d444a0 in opal_obj_new (cls=0x72f6c040) at ../../../../opal/class/opal_object.h:477 #6 0x72d442fb in opal_obj_new_debug (type=0x72f6c040, file=0x72d62840 "pml_ob1.c", line=182) at ../../../../opal/class/opal_object.h:252 #7 0x72d45a60 in mca_pml_ob1_add_comm (comm=0x601060) at pml_ob1.c:182 #8 0x7797bf51 in ompi_mpi_init (argc=1, argv=0x7fffdf58, requested=0, provided=0x7fffde28) at runtime/ompi_mpi_init.c:770 #9 0x779acc33 in PMPI_Init (argc=0x7fffde5c, argv=0x7fffde50) at pinit.c:84 #10 0x00400936 in main (argc=1, argv=0x7fffdf58) at hello_c.c:17 It seems the error happened when an object is constructed. Any idea why this is happening? Thanks. Best regards, Xin ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel
[OMPI devel] TIPC BTL Segmentation fault
Hi, As I advanced in my implementation of the TIPC BTL, I added the component and tried to run the hello_c program to test it. Then I got this segmentation fault. It seemed to happen after the call "mca_btl_tipc_add_procs". The error message displayed: [oak:23192] *** Process received signal *** [oak:23192] Signal: Segmentation fault (11) [oak:23192] Signal code: (128) [oak:23192] Failing at address: (nil) [oak:23192] [ 0] /lib/libpthread.so.0(+0xfb40) [0x7fec2a40fb40] [oak:23192] [ 1] /usr/lib/libmpi.so.0(+0x1e6c10) [0x7fec2b2afc10] [oak:23192] [ 2] /usr/lib/libmpi.so.0(+0x1e71f2) [0x7fec2b2b01f2] [oak:23192] [ 3] /usr/lib/openmpi/mca_pml_ob1.so(+0x59f2) [0x7fec264fc9f2] [oak:23192] [ 4] /usr/lib/openmpi/mca_pml_ob1.so(+0x5e5a) [0x7fec264fce5a] [oak:23192] [ 5] /usr/lib/openmpi/mca_pml_ob1.so(+0x2386) [0x7fec264f9386] [oak:23192] [ 6] /usr/lib/openmpi/mca_pml_ob1.so(+0x24a0) [0x7fec264f94a0] [oak:23192] [ 7] /usr/lib/openmpi/mca_pml_ob1.so(+0x22fb) [0x7fec264f92fb] [oak:23192] [ 8] /usr/lib/openmpi/mca_pml_ob1.so(+0x3a60) [0x7fec264faa60] [oak:23192] [ 9] /usr/lib/libmpi.so.0(+0x67f51) [0x7fec2b130f51] [oak:23192] [10] /usr/lib/libmpi.so.0(MPI_Init+0x173) [0x7fec2b161c33] [oak:23192] [11] hello_i(main+0x22) [0x400936] [oak:23192] [12] /lib/libc.so.6(__libc_start_main+0xfe) [0x7fec2a09bd8e] [oak:23192] [13] hello_i() [0x400859] [oak:23192] *** End of error message *** I used gdb to check the stack: (gdb) bt #0 0x77afac10 in opal_obj_run_constructors (object=0x6ca980) at ../opal/class/opal_object.h:427 #1 0x77afb1f2 in opal_list_construct (list=0x6ca958) at class/opal_list.c:88 #2 0x72d479f2 in opal_obj_run_constructors (object=0x6ca958) at ../../../../opal/class/opal_object.h:427 #3 0x72d47e5a in mca_pml_ob1_comm_construct (comm=0x6ca8c0) at pml_ob1_comm.c:55 #4 0x72d44386 in opal_obj_run_constructors (object=0x6ca8c0) at ../../../../opal/class/opal_object.h:427 #5 0x72d444a0 in opal_obj_new (cls=0x72f6c040) at ../../../../opal/class/opal_object.h:477 #6 0x72d442fb in
opal_obj_new_debug (type=0x72f6c040, file=0x72d62840 "pml_ob1.c", line=182) at ../../../../opal/class/opal_object.h:252 #7 0x72d45a60 in mca_pml_ob1_add_comm (comm=0x601060) at pml_ob1.c:182 #8 0x7797bf51 in ompi_mpi_init (argc=1, argv=0x7fffdf58, requested=0, provided=0x7fffde28) at runtime/ompi_mpi_init.c:770 #9 0x779acc33 in PMPI_Init (argc=0x7fffde5c, argv=0x7fffde50) at pinit.c:84 #10 0x00400936 in main (argc=1, argv=0x7fffdf58) at hello_c.c:17 It seems the error happened when an object is constructed. Any idea why this is happening? Thanks. Best regards, Xin
Re: [OMPI devel] Compiling problem in trunk?
Strangely, as I re-downloaded everything and built from scratch again, there was no error this time. On 06/27/2011 04:32 PM, Jeff Squyres wrote: Actually, can you send all the information listed here: http://www.open-mpi.org/community/help/ On Jun 27, 2011, at 10:04 AM, Xin He wrote: Hi, attached is my config.log. Hope it helps. Regards, Xin On 06/27/2011 03:22 PM, Josh Hursey wrote: I tried a fresh checkout of the trunk this morning (r24823) and could not reproduce with that configure string on a Linux 2.6.18-238.12.1.el5 x86_64 machine. Can you send a zip'ed up copy of your config.log? That may help us highlight any other environment differences. -- Josh On Mon, Jun 27, 2011 at 5:01 AM, Xin He <xin.i...@ericsson.com> wrote: Hi, I even tried re-downloading the whole project and did all the steps: first autogen, then ./configure --disable-mpi-f90 --disable-mpi-f77 --disable-mpi-cxx --disable-vt --disable-io-romio --prefix=/usr --enable-heterogeneous. Those messages appeared during "make". I'm using Ubuntu 64bit. /Xin On 06/23/2011 05:49 PM, Jeff Squyres wrote: Xin -- Can you provide more details on exactly what part of the build is failing? None of the rest of us are seeing the problem. When you svn up'ed, did you re-run autogen.pl / configure? On Jun 23, 2011, at 9:04 AM, Xin He wrote: Thanks for the tips about configuration. Yet the build still failed. Anyway, I managed to roll back to an earlier version and successfully installed :) /Xin On 06/23/2011 01:26 PM, Jeff Squyres wrote: I don't believe we have changed anything in the trunk w.r.t. the Fortran 90 stuff (there's stuff off in a branch waiting to come in, but I don't think it has come in). Since you're primarily working on a new BTL, you might want to speed up your configure/build process by disabling Fortran and other optional stuff. Try: ./configure --disable-mpi-f90 --disable-mpi-f77 --disable-mpi-cxx --disable-vt --disable-io-romio ...
That should speed things up a bit, and also avoid whatever this Fortran problem is. On Jun 23, 2011, at 7:23 AM, Xin He wrote: Hi, as I compiled the sources from "trunk", I got these error messages when doing make: [blablabla...] make all-am make[3]: Entering directory `/home/ehhexxn/git/ompi/ompi/include' FC mpif90-ext.lo libtool: compile: unrecognized option `-c' libtool: compile: Try `libtool --help' for more information. make[3]: *** [mpif90-ext.lo] Error 1 make[3]: Leaving directory `/home/ehhexxn/git/ompi/ompi/include' make[2]: *** [all] Error 2 make[2]: Leaving directory `/home/ehhexxn/git/ompi/ompi/include' make[1]: *** [all-recursive] Error 1 make[1]: Leaving directory `/home/ehhexxn/git/ompi/ompi' make: *** [all-recursive] Error 1 I was able to compile an earlier version of trunk. Best regards, Xin
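For reference, the rebuild sequence implied by this exchange -- after an svn up, regenerate the build system and reconfigure before running make -- looks like this. The flags come from Jeff's suggestion; the --prefix path is only illustrative:

```shell
# After updating the source tree (svn up), regenerate the build system first:
./autogen.pl

# Reconfigure with Fortran/C++/VT/ROMIO disabled to speed up the build:
./configure --disable-mpi-f90 --disable-mpi-f77 --disable-mpi-cxx \
            --disable-vt --disable-io-romio --prefix=/usr

# Then rebuild and install:
make -j4 && make install
```

Skipping the autogen/configure step after an update is a common way to end up with stale libtool/Makefile fragments like the `unrecognized option '-c'` failure above.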
Re: [OMPI devel] Compiling problem in trunk?
Hi, attached is my config.log. Hope it helps. Regards, Xin On 06/27/2011 03:22 PM, Josh Hursey wrote: I tried a fresh checkout of the trunk this morning (r24823) and could not reproduce with that configure string on a Linux 2.6.18-238.12.1.el5 x86_64 machine. Can you send a zip'ed up copy of your config.log? That may help us highlight any other environment differences. -- Josh On Mon, Jun 27, 2011 at 5:01 AM, Xin He <xin.i...@ericsson.com> wrote: Hi, I even tried re-downloading the whole project and did all the steps: first autogen, then ./configure --disable-mpi-f90 --disable-mpi-f77 --disable-mpi-cxx --disable-vt --disable-io-romio --prefix=/usr --enable-heterogeneous. Those messages appeared during "make". I'm using Ubuntu 64bit. /Xin On 06/23/2011 05:49 PM, Jeff Squyres wrote: Xin -- Can you provide more details on exactly what part of the build is failing? None of the rest of us are seeing the problem. When you svn up'ed, did you re-run autogen.pl / configure? On Jun 23, 2011, at 9:04 AM, Xin He wrote: Thanks for the tips about configuration. Yet the build still failed. Anyway, I managed to roll back to an earlier version and successfully installed :) /Xin On 06/23/2011 01:26 PM, Jeff Squyres wrote: I don't believe we have changed anything in the trunk w.r.t. the Fortran 90 stuff (there's stuff off in a branch waiting to come in, but I don't think it has come in). Since you're primarily working on a new BTL, you might want to speed up your configure/build process by disabling Fortran and other optional stuff. Try: ./configure --disable-mpi-f90 --disable-mpi-f77 --disable-mpi-cxx --disable-vt --disable-io-romio ... That should speed things up a bit, and also avoid whatever this Fortran problem is. On Jun 23, 2011, at 7:23 AM, Xin He wrote: Hi, as I compiled the sources from "trunk", I got these error messages when doing make: [blablabla...]
make all-am make[3]: Entering directory `/home/ehhexxn/git/ompi/ompi/include' FC mpif90-ext.lo libtool: compile: unrecognized option `-c' libtool: compile: Try `libtool --help' for more information. make[3]: *** [mpif90-ext.lo] Error 1 make[3]: Leaving directory `/home/ehhexxn/git/ompi/ompi/include' make[2]: *** [all] Error 2 make[2]: Leaving directory `/home/ehhexxn/git/ompi/ompi/include' make[1]: *** [all-recursive] Error 1 make[1]: Leaving directory `/home/ehhexxn/git/ompi/ompi' make: *** [all-recursive] Error 1 I was able to compile an earlier version of trunk. Best regards, Xin
Re: [OMPI devel] Compiling problem in trunk?
Hi, I even tried re-downloading the whole project and did all the steps: first autogen, then ./configure --disable-mpi-f90 --disable-mpi-f77 --disable-mpi-cxx --disable-vt --disable-io-romio --prefix=/usr --enable-heterogeneous. Those messages appeared during "make". I'm using Ubuntu 64bit. /Xin On 06/23/2011 05:49 PM, Jeff Squyres wrote: Xin -- Can you provide more details on exactly what part of the build is failing? None of the rest of us are seeing the problem. When you svn up'ed, did you re-run autogen.pl / configure? On Jun 23, 2011, at 9:04 AM, Xin He wrote: Thanks for the tips about configuration. Yet the build still failed. Anyway, I managed to roll back to an earlier version and successfully installed :) /Xin On 06/23/2011 01:26 PM, Jeff Squyres wrote: I don't believe we have changed anything in the trunk w.r.t. the Fortran 90 stuff (there's stuff off in a branch waiting to come in, but I don't think it has come in). Since you're primarily working on a new BTL, you might want to speed up your configure/build process by disabling Fortran and other optional stuff. Try: ./configure --disable-mpi-f90 --disable-mpi-f77 --disable-mpi-cxx --disable-vt --disable-io-romio ... That should speed things up a bit, and also avoid whatever this Fortran problem is. On Jun 23, 2011, at 7:23 AM, Xin He wrote: Hi, as I compiled the sources from "trunk", I got these error messages when doing make: [blablabla...] make all-am make[3]: Entering directory `/home/ehhexxn/git/ompi/ompi/include' FC mpif90-ext.lo libtool: compile: unrecognized option `-c' libtool: compile: Try `libtool --help' for more information.
make[3]: *** [mpif90-ext.lo] Error 1 make[3]: Leaving directory `/home/ehhexxn/git/ompi/ompi/include' make[2]: *** [all] Error 2 make[2]: Leaving directory `/home/ehhexxn/git/ompi/ompi/include' make[1]: *** [all-recursive] Error 1 make[1]: Leaving directory `/home/ehhexxn/git/ompi/ompi' make: *** [all-recursive] Error 1 I was able to compile an earlier version of trunk. Best regards, Xin
Re: [OMPI devel] Compiling problem in trunk?
Thanks for the tips about configuration. Yet the build still failed. Anyway, I managed to roll back to an earlier version and successfully installed :) /Xin On 06/23/2011 01:26 PM, Jeff Squyres wrote: I don't believe we have changed anything in the trunk w.r.t. the Fortran 90 stuff (there's stuff off in a branch waiting to come in, but I don't think it has come in). Since you're primarily working on a new BTL, you might want to speed up your configure/build process by disabling Fortran and other optional stuff. Try: ./configure --disable-mpi-f90 --disable-mpi-f77 --disable-mpi-cxx --disable-vt --disable-io-romio ... That should speed things up a bit, and also avoid whatever this Fortran problem is. On Jun 23, 2011, at 7:23 AM, Xin He wrote: Hi, as I compiled the sources from "trunk", I got these error messages when doing make: [blablabla...] make all-am make[3]: Entering directory `/home/ehhexxn/git/ompi/ompi/include' FC mpif90-ext.lo libtool: compile: unrecognized option `-c' libtool: compile: Try `libtool --help' for more information. make[3]: *** [mpif90-ext.lo] Error 1 make[3]: Leaving directory `/home/ehhexxn/git/ompi/ompi/include' make[2]: *** [all] Error 2 make[2]: Leaving directory `/home/ehhexxn/git/ompi/ompi/include' make[1]: *** [all-recursive] Error 1 make[1]: Leaving directory `/home/ehhexxn/git/ompi/ompi' make: *** [all-recursive] Error 1 I was able to compile an earlier version of trunk. Best regards, Xin
[OMPI devel] Compiling problem in trunk?
Hi, as I compiled the sources from "trunk", I got these error messages when doing make: [blablabla...] make all-am make[3]: Entering directory `/home/ehhexxn/git/ompi/ompi/include' FC mpif90-ext.lo libtool: compile: unrecognized option `-c' libtool: compile: Try `libtool --help' for more information. make[3]: *** [mpif90-ext.lo] Error 1 make[3]: Leaving directory `/home/ehhexxn/git/ompi/ompi/include' make[2]: *** [all] Error 2 make[2]: Leaving directory `/home/ehhexxn/git/ompi/ompi/include' make[1]: *** [all-recursive] Error 1 make[1]: Leaving directory `/home/ehhexxn/git/ompi/ompi' make: *** [all-recursive] Error 1 I was able to compile an earlier version of trunk. Best regards, Xin
Re: [OMPI devel] Open-MPI on TIPC
Thank you for replying. I have now read through the documents mentioned, created the component "tipc", and successfully built a library from it (the content is empty, of course). So to advance the work, I will need to actually implement the library. I also noticed btl.h and a "template" folder, which seems to demonstrate the structure of a typical BTL component. However, I do not understand the code very well yet. Would you please explain the structure a little bit, like what the files are for? I understand that btl_template.h defines the interfaces for export, but what are the other files (endpoint, proc, frag) for? And why this structure? Thank you in advance for your kind explanation. Regards, Xin Probably the best docs to check would be what were referred to in that thread, and http://www.open-mpi.org/papers/ppam-2005/ for an overview. Read through ompi/mca/pml/pml.h. It's the interface for the MPI "engine" behind OMPI's point-to-point functions, like MPI_SEND and MPI_RECV and friends. The PML uses BTLs to perform all the transport-level operations (e.g., over a specific type of network and/or protocol). BTLs are dumb byte-pushers; they have no concept of MPI semantics -- all the MPI semantics are handled in the upper-level PML. BTLs are also not allowed to block; the PML will poll them when necessary. Look through ompi/mca/btl/btl.h for a description of the BTL interface that BTLs are expected to export. Also have a look at the following wiki pages: https://svn.open-mpi.org/trac/ompi/wiki/NewDeveloper https://svn.open-mpi.org/trac/ompi/wiki/UsingMercurial (same principles apply to git or any other DVCS) https://svn.open-mpi.org/trac/ompi/wiki/devel/Autogen https://svn.open-mpi.org/trac/ompi/wiki/devel/CreateComponent https://svn.open-mpi.org/trac/ompi/wiki/BTLSemantics On Jun 13, 2011, at 4:39 AM, Xin He wrote: > Hi, > > I just started working on adding a BTL module of TIPC (Transparent > Inter-process Communication) for Open MPI.
> > My coworker posted about this topic a year ago: > http://www.open-mpi.org/community/lists/devel/2010/05/7914.php > > I read the thread. I am wondering if someone could provide the documents > mentioned. A few unofficial documents or an explanation > of how to add a BTL module would be of great help to me :) > > Regards, > Xin
[OMPI devel] Open-MPI on TIPC
Hi, I just started working on adding a BTL module of TIPC (Transparent Inter-process Communication) for Open MPI. My coworker posted about this topic a year ago: http://www.open-mpi.org/community/lists/devel/2010/05/7914.php I read the thread. I am wondering if someone could provide the documents mentioned. A few unofficial documents or an explanation of how to add a BTL module would be of great help to me :) Regards, Xin