[computer-go] Re: A cluster version of Zen is running on cgos 19x19
Hello Hideki,

> Ingo, Zengg19 is now running in the Computer Go room as a rank-free bot with 30 minutes sudden death. It's running on a (mini) cluster of four hand-built Intel quad-core computers.

Thank you for that Christmas surprise. And Cluster-Zen's performance on cgos is impressive, indeed:
http://cgos.boardspace.net/19x19/cross/Zengg19-4x4c.html
http://cgos.boardspace.net/19x19/standings.html

Ingo.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
[computer-go] Re: A cluster version of Zen is running on cgos 19x19
Ingo,

Zengg19 is now running in the Computer Go room as a rank-free bot with 30 minutes sudden death. It's running on a (mini) cluster of four hand-built Intel quad-core computers.

Hideki

Ingo Althöfer: 20091124200643.255...@gmx.net:
> Hideki replied:
>>> Do I have a Christmas wish for free already? It is: Let the cluster also run on KGS - against the humans.
>> I'd like to do so but it's not allowed to connect the cluster to the Internet, sigh.
> Hmm. As CGOS is also Internet, it seems that Zen-author does not allow you to connect to KGS.
> Is Zen-Author reading here? Maybe he can reconsider the possibility.
> I want Cluster-Zen for Christmas, Cluster-Zen for Christmas, Cluster-Zen for Christmas, please, please, please, please... Little child In-Go.
--
g...@nue.ci.i.u-tokyo.ac.jp (Kato)
Re: [computer-go] Re: A cluster version of Zen is running on cgos 19x19
> In your (or Sylvain's?) recent paper, you wrote that an interval of less than one second was useless. I've observed something similar. I'm now evaluating the performance with 0.2, 0.4, 1 and 4 second intervals for a 5 seconds per move setting on a 19x19 board on 32 nodes of the HA8000 cluster.

Yes, one second is fine for 5 seconds per move. Maybe you can check whether you get a linear speed-up if you artificially simulate zero communication time? My guess is that the communication time should not be a problem, but if you don't use MPI, maybe there's something in your implementation of communications?

By the way, a cluster parallelization in MPI can be developed very quickly and MPI is efficient: mpi_all_reduce has a computational cost logarithmic in the number of nodes.

Good luck,
Olivier
Re: [computer-go] Re: A cluster version of Zen is running on cgos 19x19
Olivier Teytaud: aa5e3c330911250005v1d434a5bj8a09067a620ef...@mail.gmail.com:
>> In your (or Sylvain's?) recent paper, you wrote that an interval of less than one second was useless. I've observed something similar. I'm now evaluating the performance with 0.2, 0.4, 1 and 4 second intervals for a 5 seconds per move setting on a 19x19 board on 32 nodes of the HA8000 cluster.
> Yes, one second is fine for 5 seconds per move. Maybe you can check whether you get a linear speed-up if you artificially simulate zero communication time? My guess is that the communication time should not be a problem, but if you don't use MPI, maybe there's something in your implementation of communications?

Hmm, I don't think my communication code is the problem.

> By the way, a cluster parallelization in MPI can be developed very quickly and MPI is efficient: mpi_all_reduce has a computational cost logarithmic in the number of nodes.

Even if the summing is done in logarithmic time (binary-tree style), the time to collect all information from all nodes is proportional to the number of nodes if the master node has few communication ports, isn't it?

MPI is the best choice for dedicated HPC clusters, I agree. It imposes, however, several constraints, such as that nodes cannot be unplugged or plugged in during operation. MPI cannot be installed on some computers with not-so-common operating systems or on small computers without enough memory, such as game consoles. I just want a freer parallel and distributed computing environment for MCTS than MPI. My code is now running on a mini PC cluster at my home. I don't want to install MPI on my computers :).

By the way, have you experimented with a scheme that just adds instead of averaging? When I tested that, my code had some bugs and I had no success.

> Good luck,

Thanks,
Hideki
--
g...@nue.ci.i.u-tokyo.ac.jp (Kato)
Re: [computer-go] Re: A cluster version of Zen is running on cgos 19x19
> Even if the summing is done in logarithmic time (binary-tree style), the time to collect all information from all nodes is proportional to the number of nodes if the master node has few communication ports, isn't it?

No (unless I misunderstood what you mean, sorry in that case!)! Use a tree of nodes to aggregate information, and everything is logarithmic. This is implicitly done in MPI. If you have 8 nodes A, B, C, D, E, F, G, H, then

(i) first layer
    A and B send information to B
    C and D send information to D
    E and F send information to F
    G and H send information to H
(ii) second layer
    B and D send information to D
    F and H send information to H
(iii) third layer
    D and H send information to H

then do the same in the reverse order so that the accumulated information is sent back to all nodes.

> By the way, have you experimented with a scheme that just adds instead of averaging? When I tested that, my code had some bugs and I had no success.

Yes, we have tested it. Surprisingly, no significant difference. But I don't know if this would still hold today, as we have some pattern-based exploration. For a code with a score depending almost only on percentages, it's not surprising that averaging and summing are equivalent.

Best regards,
Olivier
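The layered aggregation described above can be sketched as a small simulation. This is plain Python, not MPI code: the function name tree_allreduce, the restriction to a power-of-two node count and the use of one number per node are assumptions made for illustration. In real MPI the corresponding collective is MPI_Allreduce, whose internal algorithm depends on the implementation.

```python
def tree_allreduce(values):
    """Simulate a binary-tree all-reduce (sum) over len(values) nodes.

    Returns the per-node totals after the exchange and the number of
    communication rounds used. Hypothetical helper, for illustration only.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch assumes a power-of-two node count")

    totals = list(values)
    rounds = 0

    # Reduce phase: in each layer the lower node of a pair sends its
    # partial sum to the higher node (A -> B, C -> D, ... then B -> D, ...).
    stride = 1
    while stride < n:
        for receiver in range(2 * stride - 1, n, 2 * stride):
            totals[receiver] += totals[receiver - stride]
        stride *= 2
        rounds += 1

    # Broadcast phase: the same layers in reverse order, so that every
    # node ends up holding the accumulated total.
    stride = n // 2
    while stride >= 1:
        for sender in range(2 * stride - 1, n, 2 * stride):
            totals[sender - stride] = totals[sender]
        stride //= 2
        rounds += 1

    return totals, rounds


if __name__ == "__main__":
    totals, rounds = tree_allreduce([1, 2, 3, 4, 5, 6, 7, 8])
    print(totals, rounds)
```

With 8 nodes the sketch uses 3 reduce rounds and 3 broadcast rounds, i.e. 2 * log2(8) rounds in total, whereas a single master collecting from every node one at a time would need a number of receives proportional to the node count.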
Re: [computer-go] Re: A cluster version of Zen is running on cgos 19x19
Olivier Teytaud: aa5e3c330911250119x5e01fa32w2e5f3db68704d...@mail.gmail.com:
>> Even if the summing is done in logarithmic time (binary-tree style), the time to collect all information from all nodes is proportional to the number of nodes if the master node has few communication ports, isn't it?
> No (unless I misunderstood what you mean, sorry in that case!)! Use a tree of nodes to aggregate information, and everything is logarithmic. This is implicitly done in MPI. If you have 8 nodes A, B, C, D, E, F, G, H, then
>
> (i) first layer
>     A and B send information to B
>     C and D send information to D
>     E and F send information to F
>     G and H send information to H
> (ii) second layer
>     B and D send information to D
>     F and H send information to H
> (iii) third layer
>     D and H send information to H
>
> then do the same in the reverse order so that the accumulated information is sent back to all nodes.

Interesting, surely the order is almost logarithmic. But how long does it take a packet to pass through a layer? I'm afraid the actual delay may increase.

>> By the way, have you experimented with a scheme that just adds instead of averaging? When I tested that, my code had some bugs and I had no success.
> Yes, we have tested it. Surprisingly, no significant difference. But I don't know if this would still hold today, as we have some pattern-based exploration. For a code with a score depending almost only on percentages, it's not surprising that averaging and summing are equivalent.

Simple adding has the advantage that no synchronization is required to sum up the statistics of all computers, so the time from sending a statistics packet to receiving it and adding it to the root node is reduced. This advantage, however, may not be effective in MPI environments because the number of packets increases from N to N^2 if real (i.e. UDP) broadcasting is not used. It's not so surprising that there was no significant difference in MPI environments. Ah, if the tree structure is used to broadcast packets, things may vary.

Thanks a lot,
Hideki
--
g...@nue.ci.i.u-tokyo.ac.jp (Kato)
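The remark that averaging and adding are equivalent for a score based on percentages can be checked with a small sketch. The (wins, visits) layout and the function name combine are assumptions for illustration; neither MoGo's nor Zen's actual data structures are known from this thread.

```python
from fractions import Fraction


def combine(stats, average):
    """Combine per-node (wins, visits) pairs for one move.

    With average=False the counts are simply added; with average=True
    they are divided by the number of nodes. Hypothetical helper.
    """
    n = len(stats)
    wins = sum(w for w, _ in stats)
    visits = sum(v for _, v in stats)
    if average:
        return Fraction(wins, n), Fraction(visits, n)
    return Fraction(wins), Fraction(visits)


if __name__ == "__main__":
    # Made-up statistics for one move as seen by three nodes.
    stats = [(30, 50), (45, 100), (10, 25)]
    for average in (False, True):
        wins, visits = combine(stats, average)
        print(average, wins, visits, wins / visits)
```

Both schemes give the same win rate, because the factor 1/N cancels in wins / visits. The combined visit count differs by a factor of N, so any term that depends on raw counts rather than percentages would see a difference between the two schemes.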
Re: [computer-go] Re: A cluster version of Zen is running on cgos 19x19
> Interesting, surely the order is almost logarithmic. But how long does it take a packet to pass through a layer? I'm afraid the actual delay may increase.

With gigabit Ethernet, my humble opinion is that you should have no problem. But testing what happens if you artificially cancel the time of the messages might confirm or refute this.

If you have trouble due to the communication time, I'm sure you can optimize it. MPI provides plenty of well-designed primitives for encoding communications. Unless you need very precise optimization, it's not worth working directly with sockets.

Olivier
[computer-go] Re: A cluster version of Zen is running on cgos 19x19
Hideki Kato wrote:
> I'm now testing a cluster version of Zen (Zengg-4x4c-tst), developed in a joint project with Yamato, on cgos 19x19. It has, however, won all its games (except the first one, which ended in a timeout due to a bug). Running more strong programs would be very much appreciated.

Hideki, thx for your activity. Do I have a Christmas wish for free already? It is: Let the cluster also run on KGS - against the humans.

Ingo.
[computer-go] Re: A cluster version of Zen is running on cgos 19x19
Ingo Althöfer: 20091124190802.303...@gmx.net:
> Hideki Kato wrote:
>> I'm now testing a cluster version of Zen (Zengg-4x4c-tst), developed in a joint project with Yamato, on cgos 19x19. It has, however, won all its games (except the first one, which ended in a timeout due to a bug). Running more strong programs would be very much appreciated.
> Hideki, thx for your activity. Do I have a Christmas wish for free already? It is: Let the cluster also run on KGS - against the humans.

I'd like to do so but it's not allowed to connect the cluster to the Internet, sigh.

Hideki
--
g...@nue.ci.i.u-tokyo.ac.jp (Kato)
[computer-go] Re: A cluster version of Zen is running on cgos 19x19
Hideki replied:
>> Do I have a Christmas wish for free already? It is: Let the cluster also run on KGS - against the humans.
> I'd like to do so but it's not allowed to connect the cluster to the Internet, sigh.

Hmm. As CGOS is also Internet, it seems that Zen-author does not allow you to connect to KGS.

Is Zen-Author reading here? Maybe he can reconsider the possibility.

I want Cluster-Zen for Christmas, Cluster-Zen for Christmas, Cluster-Zen for Christmas, please, please, please, please...

Little child In-Go.
[computer-go] Re: A cluster version of Zen is running on cgos 19x19
Ingo Althöfer: 20091124200643.255...@gmx.net:
> Hideki replied:
>>> Do I have a Christmas wish for free already? It is: Let the cluster also run on KGS - against the humans.
>> I'd like to do so but it's not allowed to connect the cluster to the Internet, sigh.
> Hmm. As CGOS is also Internet, it seems that Zen-author does not allow you to connect to KGS.

Ah, I was unclear. I wrote about the T2K HPC cluster, which is the main target of my development, not my home cluster. My mini cluster can freely be connected to KGS, though I have no rated bot account yet.

> Is Zen-Author reading here? Maybe he can reconsider the possibility.

He is sleeping now 'cause it's 5:30 am in Japan :).

> I want Cluster-Zen for Christmas, Cluster-Zen for Christmas, Cluster-Zen for Christmas, please, please, please, please... Little child In-Go.

I'll throw it into KGS after tuning several parameters.

Hideki
--
g...@nue.ci.i.u-tokyo.ac.jp (Kato)
[computer-go] Re: A cluster version of Zen is running on cgos 19x19
Hi Hideki,

>> Is Zen-Author reading here? Maybe he can reconsider the possibility.
> He is sleeping now 'cause it's 5:30 am in Japan :).

Ok, let him have his good sleep.

>> I want Cluster-Zen for Christmas, Cluster-Zen for Christmas, Cluster-Zen for Christmas, please, please, please, please... Little child In-Go.
> I'll throw it into KGS after tuning several parameters.

You are a 100% darling. Thanks a lot in advance.

Ingo.
Re: [computer-go] Re: A cluster version of Zen is running on cgos 19x19
In message 4b0c4522.370%hideki_ka...@ybb.ne.jp, Hideki Kato hideki_ka...@ybb.ne.jp writes
> Ingo Althöfer: 20091124200643.255...@gmx.net:
>> Hideki replied:
>>>> Do I have a Christmas wish for free already? It is: Let the cluster also run on KGS - against the humans.
>>> I'd like to do so but it's not allowed to connect the cluster to the Internet, sigh.
>> Hmm. As CGOS is also Internet, it seems that Zen-author does not allow you to connect to KGS.
> Ah, I was unclear. I wrote about the T2K HPC cluster, which is the main target of my development, not my home cluster. My mini cluster can freely be connected to KGS, though I have no rated bot account yet.
>> Is Zen-Author reading here? Maybe he can reconsider the possibility.
> He is sleeping now 'cause it's 5:30 am in Japan :).
>> I want Cluster-Zen for Christmas, Cluster-Zen for Christmas, Cluster-Zen for Christmas, please, please, please, please... Little child In-Go.
> I'll throw it into KGS after tuning several parameters.

The December KGS bot tournament will be 9x9. I guess that if a cluster-Zen competes in that (I am hoping it will), it will be unbeatable.

The existing pattern of KGS bot tournaments (see http://www.weddslist.com/kgs/future.html) means that the January one will also be 9x9, then February and March will both be 19x19. A cluster Zen in a 19x19 event will be even more interesting to watch.

Nick
--
Nick Wedd n...@maproom.co.uk
Re: [computer-go] Re: A cluster version of Zen is running on cgos 19x19
Hi Nick,

I'll participate in the coming tournaments as much as possible, but the cluster version is still under development and needs much more work and time to reach full performance. Since my mini cluster uses ordinary Gigabit Ethernet, which is much slower than expensive InfiniBand or similar high-speed network devices, it does not perform much better on 9x9. So please do not expect much :).

Also, on a 19x19 board, the current 16-core cluster version performs almost the same as an 8-core shared-memory PC such as the Mac Pro which Yamato used for KGS.

Hideki

Nick Wedd: x8jzsrck5edlf...@maproom.demon.co.uk:
> In message 4b0c4522.370%hideki_ka...@ybb.ne.jp, Hideki Kato hideki_ka...@ybb.ne.jp writes
>> Ingo Althöfer: 20091124200643.255...@gmx.net:
>>> Hideki replied:
>>>>> Do I have a Christmas wish for free already? It is: Let the cluster also run on KGS - against the humans.
>>>> I'd like to do so but it's not allowed to connect the cluster to the Internet, sigh.
>>> Hmm. As CGOS is also Internet, it seems that Zen-author does not allow you to connect to KGS.
>> Ah, I was unclear. I wrote about the T2K HPC cluster, which is the main target of my development, not my home cluster. My mini cluster can freely be connected to KGS, though I have no rated bot account yet.
>>> Is Zen-Author reading here? Maybe he can reconsider the possibility.
>> He is sleeping now 'cause it's 5:30 am in Japan :).
>>> I want Cluster-Zen for Christmas, Cluster-Zen for Christmas, Cluster-Zen for Christmas, please, please, please, please... Little child In-Go.
>> I'll throw it into KGS after tuning several parameters.
> The December KGS bot tournament will be 9x9. I guess that if a cluster-Zen competes in that (I am hoping it will), it will be unbeatable.
> The existing pattern of KGS bot tournaments (see http://www.weddslist.com/kgs/future.html) means that the January one will also be 9x9, then February and March will both be 19x19. A cluster Zen in a 19x19 event will be even more interesting to watch.
> Nick
--
g...@nue.ci.i.u-tokyo.ac.jp (Kato)
Re: [computer-go] Re: A cluster version of Zen is running on cgos 19x19
> Also, on a 19x19 board, the current 16-core cluster version performs almost the same as an 8-core shared-memory PC such as the Mac Pro which Yamato used for KGS.

Hi Hideki,
Is that difference due to a scaling limit of Zen, or is it due to the cluster overhead? Would moving from gigabit to InfiniBand help, or is the limit more to do with the lack of shared memory?

> T2K HPC cluster

This seems to be a cluster specification rather than an actual machine. Can you tell us more about how many cores you are experimenting with, and how the programs scale? (Are all your experiments with Zen, or are you trying to run other programs on a cluster too?)

Darren

--
Darren Cook, Software Researcher/Developer
http://dcook.org/gobet/ (Shodan Go Bet - who will win?)
http://dcook.org/mlsn/ (Multilingual open source semantic network)
http://dcook.org/work/ (About me and my work)
http://dcook.org/blogs.html (My blogs and articles)
Re: [computer-go] Re: A cluster version of Zen is running on cgos 19x19
Darren Cook: 4b0c6706.7070...@dcook.org:
>> Also, on a 19x19 board, the current 16-core cluster version performs almost the same as an 8-core shared-memory PC such as the Mac Pro which Yamato used for KGS.
> Hi Hideki,
> Is that difference due to a scaling limit of Zen, or is it due to the cluster overhead? Would moving from gigabit to InfiniBand help, or is the limit more to do with the lack of shared memory?

I'm evaluating the scaling right now (:-). The performance gap is perhaps due to the algorithms. Almost all cluster versions of current strong programs (MoGo, MFG, Fuego and Zen) use root parallelism, while shared-memory computers allow us to use thread parallelism, which gives better performance. The main reason, I guess, is that the latter increases the depth of the search tree with the number of processors (cores) while the former does not.

One interesting observation about root parallelism is that the scaling depends on the time per move: longer time settings show better scalability when the interval for exchanging root information is fixed. In other words, each time setting has its best number of nodes. This makes things complicated :(.

The scaling limit of Zen is still unknown, though before starting this joint project with Yamato I expected that Zen's playouts were not random enough for it to scale well.

> T2K HPC cluster
> This seems to be a cluster specification rather than an actual machine. Can you tell us more about how many cores you are experimenting with, and how the programs scale? (Are all your experiments with Zen, or are you trying to run other programs on a cluster too?)

I'm running only Zen on the cluster, though I'd like to run my Fudo Go as well if I have (had?) time.

Name: T2K Open Supercomputer (Todai)
    # Todai is an abbreviation of University of Tokyo in Japanese.
Hardware: HITACHI HA8000-tc/RS425
Number of nodes: 952
Number of cores of each node: 16
    # I can use up to 64 nodes; 1024 cores in total
Processor: AMD Opteron 8356 (quad-core) 2.3 GHz
Memory of each node: 32 GB
Interconnect: Myricom Myri-10G
Operating System: RedHat Enterprise Linux 5
# Flops numbers are omitted. :)

http://www.cc.u-tokyo.ac.jp/service/ha8000/intro.html (in Japanese)

T2K stands for Tokyo, Tsukuba and Kyoto (T, T, K). See http://www.open-supercomputer.org/ (in English) for the idea of the T2K Open Supercomputer.

Hideki
--
g...@nue.ci.i.u-tokyo.ac.jp (Kato)
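Root parallelism, as mentioned in the message above, can be sketched roughly as follows: each cluster node searches independently, and only the per-move statistics at the root are combined. The function names merge_root_stats and best_move, the (wins, visits) layout and the most-visited selection rule are assumptions for illustration; this is not Zen's actual implementation.

```python
def merge_root_stats(per_node_stats):
    """Add up the root-level (wins, visits) of every move over all nodes.

    per_node_stats is a list with one dict per cluster node, mapping a
    move to its (wins, visits) pair at that node's root. Hypothetical
    layout, chosen for this sketch.
    """
    merged = {}
    for node_stats in per_node_stats:
        for move, (wins, visits) in node_stats.items():
            total_wins, total_visits = merged.get(move, (0, 0))
            merged[move] = (total_wins + wins, total_visits + visits)
    return merged


def best_move(merged):
    """Pick the move with the most combined visits (one common choice)."""
    return max(merged, key=lambda move: merged[move][1])


if __name__ == "__main__":
    # Made-up root statistics from two nodes.
    node_a = {"D4": (30, 50), "Q16": (20, 40)}
    node_b = {"D4": (10, 30), "Q16": (40, 60)}
    merged = merge_root_stats([node_a, node_b])
    print(merged, best_move(merged))
```

Because only root statistics are shared, each node still builds its own tree with its own simulation budget, which fits the observation above that root parallelism does not deepen the search tree the way thread parallelism on shared memory does.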
Re: [computer-go] Re: A cluster version of Zen is running on cgos 19x19
> The performance gap is perhaps due to the algorithms. Almost all cluster versions of current strong programs (MoGo, MFG, Fuego and Zen) use root parallelism, while shared-memory computers allow us to use thread parallelism, which gives better performance.

I think you should not have trouble with your network, at least with the number of machines you are considering. Perhaps you should slightly increase the time between two communications? With something like mpi_all_reduce for averaging the statistics over the whole tree at each communication, more than 3 or 4 communications per second is useless. Averaging statistics in nodes with less than 5% of the total number of simulations might also be useless.

Best regards,
Olivier
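The 5% rule of thumb above could be applied as a filter before each exchange. In this sketch the function name nodes_worth_sharing, the dict layout and the default threshold parameter are assumptions for illustration; how MoGo actually selects tree nodes is not described in this thread.

```python
def nodes_worth_sharing(visit_counts, total_simulations, min_share=0.05):
    """Return the tree nodes whose visit count reaches min_share of the total.

    visit_counts maps a tree-node identifier (for example a move
    sequence) to its number of simulations. Hypothetical helper.
    """
    threshold = min_share * total_simulations
    return [node for node, visits in visit_counts.items() if visits >= threshold]


if __name__ == "__main__":
    # Made-up visit counts for four tree nodes out of 1000 simulations.
    counts = {"D4": 600, "Q16": 300, "D4 Q4": 60, "K10": 40}
    print(nodes_worth_sharing(counts, total_simulations=1000))
```

Only the nodes returned by the filter would have their statistics averaged across machines; rarely visited nodes stay local, which keeps each message small.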
Re: [computer-go] Re: A cluster version of Zen is running on cgos 19x19
Thank you Olivier,

Olivier Teytaud: aa5e3c330911242304tc6b9e1bk466b1f08cb65d...@mail.gmail.com:
>> The performance gap is perhaps due to the algorithms. Almost all cluster versions of current strong programs (MoGo, MFG, Fuego and Zen) use root parallelism, while shared-memory computers allow us to use thread parallelism, which gives better performance.
> I think you should not have trouble with your network, at least with the number of machines you are considering. Perhaps you should slightly increase the time between two communications? With something like mpi_all_reduce for averaging the statistics over the whole tree at each communication, more than 3 or 4 communications per second is useless. Averaging statistics in nodes with less than 5% of the total number of simulations might also be useless.

In your (or Sylvain's?) recent paper, you wrote that an interval of less than one second was useless. I've observed something similar. I'm now evaluating the performance with 0.2, 0.4, 1 and 4 second intervals for a 5 seconds per move setting on a 19x19 board on 32 nodes of the HA8000 cluster. Though I don't have enough games yet, the current best is the 1 second interval, which gains about 400 Elo in self-play.

Then why do we get similar experimental results with different implementations of root parallelism, based on different programs and on different clusters? I don't use MPI for the cluster version of Zen. Zen's playouts are slower than MoGo's. Etc... One second is a mysterious time :(.

Hideki
--
g...@nue.ci.i.u-tokyo.ac.jp (Kato)