[julia-users] Re: Is there a tutorial on how to set up my own Julia cluster?

2015-10-28 Thread Dan
Perhaps the [count*] notation means "repeat the line count times", i.e.:
hd1
hd1
hd2
hd2
:

no time to dig or test yet, so this is just another guess.




[julia-users] Re: Is there a tutorial on how to set up my own Julia cluster?

2015-10-28 Thread Ismael VC
Thank you, Seth. The count argument is not supported in 0.3.x; I'll update 
to 0.4.x shortly.



[julia-users] Re: Is there a tutorial on how to set up my own Julia cluster?

2015-10-28 Thread Ismael VC
Thank you very much, Greg, that worked! :D

On Wednesday, October 28, 2015, 13:31:53 (UTC-6), Greg Plowman wrote:
>
> On v0.3, try multiple entries (lines) in the machine file, one for each worker.



[julia-users] Re: Is there a tutorial on how to set up my own Julia cluster?

2015-10-28 Thread Seth

On Wednesday, October 28, 2015 at 10:20:00 AM UTC-7, Ismael VC wrote:
>
> How can I start 2 workers on each node, using Julia 0.3.11?
>
> [count*][user@]host[:port] [bind_addr[:port]]
>
> The way I understand
>
> [count*][user@]host[:port] [bind_addr[:port]]
>
> is that `count` is an integer, while `*` means zero or more repetitions in 
> regex notation. At first it seems that no space character is needed between 
> the count and the `user@host`, but I have tried several forms and none of 
> them works:
>

I don't think your interpretation is correct. I think the "*" is syntax for 
"(this many) times". Did you try appending an asterisk after the number? 
That is, "2* user@host "?


[julia-users] Re: Is there a tutorial on how to set up my own Julia cluster?

2015-10-28 Thread Ismael VC
Hello everyone,

I have successfully added all nodes and I can start Julia like this:

[root@hd0 ~]# julia -p 2 --machinefile Beowulf
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.3.11 (2015-07-27 06:18 UTC)
 _/ |\__'_|_|_|\__'_|  |  Official http://julialang.org/ release
|__/                   |  x86_64-unknown-linux-gnu


julia> nprocs()
22


julia> nworkers()
21


julia> 


where the Beowulf file looks like this:

hd1
hd2
hd3
hd4
hd5
hd6
hd7
hd8
hd9
hd10
hd11
hd12
hd13
hd14
hd15
hd16
hd17
hd18
hd19
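
The counts reported above are consistent with one worker per machine-file line, plus the two local workers requested with -p 2, plus the master process. A small sketch of that arithmetic follows. It assumes that the two options simply add up, which is what the output suggests but has not been checked against the 0.3 source; the variable names are made up for illustration.

```shell
#!/bin/sh
# Assumption: workers from -p and from --machinefile add up,
# as the nprocs()/nworkers() output in this message suggests.
LOCAL_WORKERS=2          # from `-p 2`
MACHINEFILE_LINES=19     # hd1 .. hd19, one worker per line

workers=$((LOCAL_WORKERS + MACHINEFILE_LINES))
procs=$((workers + 1))   # the master process is counted by nprocs()

echo "nworkers = $workers"
echo "nprocs   = $procs"
```

This gives 21 workers and 22 processes, matching the session shown above.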

If I change it to:

2 hd1
2 hd2
2 hd3
2 hd4
2 hd5
2 hd6
2 hd7
2 hd8
2 hd9
2 hd10
2 hd11
2 hd12
2 hd13
2 hd14
2 hd15
2 hd16
2 hd17
2 hd18
2 hd19



I get the same error I mentioned:

[root@hd0 ~]# julia -p 2 --machinefile Beowulf2
ssh: connect to host 2 port 22: Invalid argument
ssh: connect to host 2 port 22: Invalid argument
ssh: connect to host 2 port 22: Invalid argument
ssh: connect to host 2 port 22: Invalid argument
ssh: connect to host 2 port 22: Invalid argument
ssh: connect to host 2 port 22: Invalid argument
ssh: connect to host 2 port 22: Invalid argument
ssh: connect to host 2 port 22: Invalid argument
ssh: connect to host 2 port 22: Invalid argument
ssh: connect to host 2 port 22: Invalid argument
ssh: connect to host 2 port 22: Invalid argument
ssh: connect to host 2 port 22: Invalid argument
ssh: connect to host 2 port 22: Invalid argument
ssh: connect to host 2 port 22: Invalid argument
ssh: connect to host 2 port 22: Invalid argument
ssh: connect to host 2 port 22: Invalid argument
ssh: connect to host 2 port 22: Invalid argument
ssh: connect to host 2 port 22: Invalid argument
ssh: connect to host 2 port 22: Invalid argument
^CERROR: interrupt
 in match at ./regex.jl:119
 in parse_connection_info at multi.jl:1090
 in read_worker_host_port at multi.jl:1037
 in read_cb_response at multi.jl:1015
 in start_cluster_workers at multi.jl:1027
 in addprocs_internal at multi.jl:1234
 in addprocs at multi.jl:1244
 in process_options at ./client.jl:240
 in _start at ./client.jl:354
[root@hd0 ~]#




[julia-users] Re: Is there a tutorial on how to set up my own Julia cluster?

2015-10-28 Thread 'Greg Plowman' via julia-users
On v0.3, try multiple entries (lines) in the machine file, one for each worker.
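
One way to produce such a file is sketched below. It assumes the host names hd1 to hd19 used elsewhere in this thread and two workers per node; the output file name Beowulf_x2 is only a placeholder.

```shell
#!/bin/sh
# Write each host name WORKERS_PER_NODE times, one line per worker.
# Host names hd1..hd19 mirror the Beowulf file quoted in this thread;
# the file name is a hypothetical placeholder.
WORKERS_PER_NODE=2
MACHINEFILE=${MACHINEFILE:-Beowulf_x2}

: > "$MACHINEFILE"                 # start from an empty file
i=1
while [ "$i" -le 19 ]; do
    n=0
    while [ "$n" -lt "$WORKERS_PER_NODE" ]; do
        echo "hd$i" >> "$MACHINEFILE"
        n=$((n + 1))
    done
    i=$((i + 1))
done

wc -l < "$MACHINEFILE"             # number of worker lines written
```

The resulting file has 38 lines (hd1, hd1, hd2, hd2, ...) and can then be passed as julia --machinefile Beowulf_x2.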

[julia-users] Re: Is there a tutorial on how to set up my own Julia cluster?

2015-10-28 Thread Ismael VC
How can I start 2 workers on each node, using Julia 0.3.11?

[count*][user@]host[:port] [bind_addr[:port]]

I have a machine file with only one node (one line). These examples are the 
forms that work, but they add only one worker per node. I'm using the 
default port for now and no separate bind address:


   - Only host:

555.555.555.555

   - User and host:

root@555.555.555.555


The way I understand

[count*][user@]host[:port] [bind_addr[:port]]

is that `count` is an integer, while `*` means zero or more repetitions in 
regex notation. At first it seems that no space character is needed between 
the count and the `user@host`, but I have tried several forms and none of 
them works:

* Use `2` as `count`, separated by space, with `my_file` being either:

2 555.555.555.555

or

2 root@555.555.555.555

[root@example ~]# julia --machinefile my_file
ssh: connect to host 2 port 22: Invalid argument


It seems to me it tries to use the 2 as the host address :(

Could anyone please give me an example of a machine file which specifies 
the worker count?

Thanks in advance, cheers! 
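
To illustrate the documented form [count*][user@]host[:port], here is a small sketch that splits a machine spec into its parts, treating `*` as a literal separator after the count. It is not Julia's parser, only an illustration of the grammar quoted from the manual, and the function name parse_spec is made up. Elsewhere in this thread the count prefix is reported as unsupported on 0.3.x, so this only concerns the documented syntax.

```shell
#!/bin/sh
# Illustration only: split [count*][user@]host[:port] into its parts.
# This is NOT Julia's own parsing code.
parse_spec() {
    spec=$1
    count=1; user=; port=
    case $spec in
        *\**) count=${spec%%\**}; spec=${spec#*\*} ;;   # text before '*' is the count
    esac
    case $spec in
        *@*) user=${spec%%@*}; spec=${spec#*@} ;;       # text before '@' is the user
    esac
    case $spec in
        *:*) port=${spec##*:}; spec=${spec%:*} ;;       # text after ':' is the port
    esac
    echo "count=$count user=${user:-<current>} host=$spec port=${port:-<default>}"
}

parse_spec "2*root@555.555.555.555"

# A line such as "2 root@555.555.555.555" has two space-separated fields.
# Per the documented form, the first field is the machine spec, so "2"
# is taken as the host name.
line="2 root@555.555.555.555"
first_field=${line%% *}
parse_spec "$first_field"
```

The second call reports host=2, which is consistent with the "connect to host 2" error shown above.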





On Friday, September 25, 2015, 16:42:59 (UTC-5), Ismael VC wrote:
>
> Hello everyone!
>
> I am trying to set up a Julia cluster with 20 nodes, this is the very 
> first time I've tried something like this. I have looked around for 
> examples, but documentation is not very helpful for me:
>
> *Julia can be started in parallel mode with either the -p or 
> the --machinefile options. -p n will launch an additional n worker 
> processes, while --machinefile file will launch a worker for each line in 
> file file. The machines defined in file must be accessible via a 
> passwordless ssh login, with Julia installed at the same location as the 
> current host. Each machine definition takes the 
> form [count*][user@]host[:port] [bind_addr[:port]] . user defaults to 
> current user, port to the standard ssh port. count is the number of workers 
> to spawn on the node, and defaults to 1. The 
> optional bind-to bind_addr[:port] specifies the ip-address and port that 
> other workers should use to connect to this worker.*
>
> This is what I think I have understood so far:
>
> Ok I list the machines on a machine file, that's easy, I have a file like 
> this:
>
> n user@555.555.555.555
> n user@555.555.555.556
> n user@555.555.555.555
>
>
> *The machines defined in file must be accessible via a 
> passwordless ssh login,*
>
> This is the part that is most difficult for me: it says that the machines 
> must be accessible via passwordless ssh.
>
> * with Julia installed at the same location as the current host.*
>
> I understand this to mean that I need to install Julia on every node in the 
> same location, so I have 20 nodes with the same software and hardware 
> stacks. Does this mean that the nodes must all run the same operating 
> system, with the same word size (32/64 bits)?
>
> Right now I have *20 CentOS 6.7 (64 bits)* nodes with* julia-0.3.11* 
> installed from the *generic linux binaries (64bits)*, all of them 
> installed at */opt/julia-0.3.11/bin* (added to the PATH and already 
> exported in /etc/profile)
>
> Now the plan in my mind is to use my laptop *(windows 7 64 bits, 
> julia-0.3.11 64 bits)* as master node and control the cluster with that, 
> so according to what I understand, I'll need to do (leaving password blank):
>
> ssh-keygen -t rsa
>
>
> From my Windows laptop (I plan to install Arch Linux soon), in order to 
> create my ssh key and then:
>
>
> cat ~/.ssh/id_rsa.pub | ssh user@hostname 'cat >> .ssh/authorized_keys'
>
>
>
> To every node? So I have to be running the ssh server at every one of them? 
> (I understand I'll need it at the master node) This is where I simply don't 
> understand anymore, I haven't seen any tutorial, or article, or something 
> like that, just that paragraph in the manual, I know there is 
> ClusterManagers.jl but that sounds even more complicated for me right now.
>
>
> I also want to help David Sanders set up another cluster (once I get this 
> figured out) in his lab at the Science Faculty, UNAM. I promise to enhance the 
> documentation around this topic once I understand this.
>
>
> What do you guys think, do I have it all wrong?
>
>
> If anyone can help me, I'll be very grateful. Thanks in advance!
>
>
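
The passwordless-ssh step described in the quoted message could be scripted roughly as follows. The master opens an ssh connection to each node, so each node needs to be running an SSH server. This is a dry run that only prints the commands; the host list and the root user are assumptions taken from examples in this thread, and the function name is made up.

```shell
#!/bin/sh
# Dry run: print the commands instead of executing them.
# HOSTS and the root user are placeholders; replace them with your own nodes.
HOSTS="hd1 hd2 hd3"
KEY="$HOME/.ssh/id_rsa.pub"

plan_key_setup() {
    echo "ssh-keygen -t rsa    # once, on the master; leave the passphrase empty"
    for host in $HOSTS; do
        # Append the master's public key to the node's authorized_keys.
        echo "cat $KEY | ssh root@$host 'cat >> .ssh/authorized_keys'"
        # BatchMode makes ssh fail instead of prompting for a password,
        # so this line checks that passwordless login really works.
        echo "ssh -o BatchMode=yes root@$host true"
    done
}

plan_key_setup
```

Once the printed commands look right, they can be run by hand or the echo calls can be replaced with the commands themselves.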

[julia-users] Re: Is there a tutorial on how to set up my own Julia cluster?

2015-09-26 Thread michael . creel
A Julia cluster is just a cluster with Julia installed on all the nodes. 
One way of achieving this is to create a cluster using PelicanHPC, and then 
do one of:
1) install julia in the /home/user directory, for example, by compiling 
from source. This directory is NFS shared by all nodes, when the cluster is 
set up.
or
2) run apt-get install julia on all the nodes.

A PHPC cluster is a reasonable solution for a single user. I used to 
develop it, and used it for a number of years on a 4 node cluster. It's 
Debian-based.
