Re: Running spark-shell (or queries) over the network (not from master)
Solved. The problem is the following: the underlying Akka driver uses the INTERNAL interface address of the Amazon instance (the one that starts with 10.x.y.z) to present itself to the master; it does not use the external (public) IP!

Ognen

On 9/7/2014 3:21 PM, Sean Owen wrote:
> Also keep in mind there is a non-trivial amount of traffic between
> the driver and cluster. It's not something I would do by default,
> running the driver so remotely. With enough ports open it should work
> though.

To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
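The diagnosis above comes down to address scope: an address starting with 10. belongs to a private range, so it only means something inside that network, unlike the instance's public IP. A minimal sketch of that distinction using Python's standard `ipaddress` module; the function name and both sample addresses are illustrative assumptions, not values from this thread:

```python
import ipaddress


def is_private_address(addr: str) -> bool:
    """Return True if addr lies in a range reserved for private or special use.

    10.0.0.0/8 (the 10.x.y.z addresses mentioned above) is one such range.
    """
    return ipaddress.ip_address(addr).is_private


if __name__ == "__main__":
    # Both addresses are made-up examples.
    for addr in ("10.12.34.56", "54.12.34.56"):
        scope = "private" if is_private_address(addr) else "public"
        print(f"{addr}: {scope}")
```

Running a check like this against whatever address the driver reports in its logs is a quick way to tell whether it is advertising the internal interface.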
Re: Running spark-shell (or queries) over the network (not from master)
Also keep in mind there is a non-trivial amount of traffic between the driver and cluster. It's not something I would do by default, running the driver so remotely. With enough ports open it should work though.

On Sun, Sep 7, 2014 at 7:05 PM, Ognen Duzlevski wrote:
> Horacio,
>
> Thanks, I have not tried that. However, I am not after security right now -
> I am just wondering why something so obvious won't work ;)
>
> Ognen
Re: Running spark-shell (or queries) over the network (not from master)
Horacio,

Thanks, I have not tried that. However, I am not after security right now - I am just wondering why something so obvious won't work ;)

Ognen

On 9/7/2014 12:38 PM, Horacio G. de Oro wrote:
> Have you tried with ssh? It will be much more secure (only 1 port open),
> and you'll be able to run spark-shell over the network.
> [...]
Re: Running spark-shell (or queries) over the network (not from master)
Have you tried with ssh? It will be much more secure (only 1 port open), and you'll be able to run spark-shell over the network. I'm using that approach in my project (https://github.com/data-tsunami/smoke) with good results.

I can't try it right now, but something like this should work:

    ssh -tt ec2-user@YOUR-EC2-IP /path/to/spark-shell SPARK-SHELL-OPTIONS

With this approach you are way more secure (without installing a VPN), and you don't need spark/hadoop installed on your PC. You won't have access to local files, but you haven't mentioned that as a requirement :-)

Hope this helps.

Horacio
--
Web: http://www.data-tsunami.com
Email: hgde...@gmail.com
Cel: +54 9 3572 525359
LinkedIn: https://www.linkedin.com/in/hgdeoro
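If the ssh invocation above is launched from a script, the one detail worth care is quoting: ssh joins everything after the host into a single string and hands it to the remote shell. A small sketch that builds the argument list with Python's `shlex`; the function name, the host, the spark-shell path, and the options shown are all hypothetical placeholders:

```python
import shlex
from typing import List, Sequence


def build_ssh_spark_shell_argv(
    host: str,
    spark_shell: str,
    options: Sequence[str] = (),
    user: str = "ec2-user",
) -> List[str]:
    """Build argv for: ssh -tt user@host spark_shell options..."""
    # Quote the remote command as one string, because the remote shell
    # will re-parse it (e.g. the brackets in "local[2]" need protecting).
    remote_command = shlex.join([spark_shell, *options])
    # -tt forces a TTY so the interactive shell works over ssh.
    return ["ssh", "-tt", f"{user}@{host}", remote_command]


if __name__ == "__main__":
    # Placeholder host, path, and options; nothing is executed here.
    argv = build_ssh_spark_shell_argv(
        "203.0.113.10", "/opt/spark/bin/spark-shell", ["--master", "local[2]"]
    )
    print(argv)
```

The resulting list can be passed to `subprocess.call(argv)` to start the remote shell.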
Re: Running spark-shell (or queries) over the network (not from master)
Have you actually tested this?

I have two instances: one is the standalone master and the other just has spark installed, with the same version of spark (1.0.0) on both. The security group on the master allows all (0-65535) TCP and UDP traffic from the other machine, and the other machine allows all TCP/UDP traffic from the master.

Yet my spark-shell --master spark://master-ip:7077 is still failing to connect.

What am I missing?

Thanks!
Ognen

On 9/5/2014 5:34 PM, qihong wrote:
> Since you are using your home computer, it's probably not reachable
> by EC2 from the internet. You can try to set "spark.driver.host" to your
> WAN IP and "spark.driver.port" to a fixed port in SparkConf, and open that
> port in your home network (port forwarding to the computer you are
> using). See if that helps.
Re: Running spark-shell (or queries) over the network (not from master)
Ah. So there is some kind of a "back and forth" going on.

Thanks!
Ognen

On 9/5/2014 5:34 PM, qihong wrote:
> Since you are using your home computer, it's probably not reachable
> by EC2 from the internet. You can try to set "spark.driver.host" to your
> WAN IP and "spark.driver.port" to a fixed port in SparkConf, and open that
> port in your home network (port forwarding to the computer you are
> using). See if that helps.
Re: Running spark-shell (or queries) over the network (not from master)
Since you are using your home computer, it's probably not reachable by EC2 from the internet. You can try to set "spark.driver.host" to your WAN IP and "spark.driver.port" to a fixed port in SparkConf, and open that port in your home network (port forwarding to the computer you are using). See if that helps.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Running-spark-shell-or-queries-over-the-network-not-from-master-tp13543p13595.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
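One way to pin the two settings named above is a properties file of whitespace-separated "key value" lines. This sketch assumes your Spark version reads such a file (conf/spark-defaults.conf); check that against the documentation for your release. The helper name, the address (a documentation-range placeholder for your WAN IP), and the port are all made up for illustration:

```python
from typing import Dict


def render_spark_properties(props: Dict[str, str]) -> str:
    """Render properties as aligned 'key  value' lines, one per property."""
    if not props:
        return ""
    width = max(len(key) for key in props)
    return "".join(f"{key.ljust(width)}  {value}\n" for key, value in props.items())


# Placeholder values: substitute your own WAN IP and a port you have forwarded.
driver_settings = {
    "spark.driver.host": "203.0.113.10",
    "spark.driver.port": "51000",
}

if __name__ == "__main__":
    print(render_spark_properties(driver_settings), end="")
```

Whatever port is chosen here is the one that has to be forwarded on the home router to the machine running the driver.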
Re: Running spark-shell (or queries) over the network (not from master)
That is the command I ran and it still times out. Besides 7077, is there any other port that needs to be open?

Thanks!
Ognen

On 9/5/2014 4:10 PM, qihong wrote:
> the command should be "spark-shell --master spark://:7077".
Re: Running spark-shell (or queries) over the network (not from master)
The command should be "spark-shell --master spark://:7077".

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Running-spark-shell-or-queries-over-the-network-not-from-master-tp13543p13593.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Re: Running spark-shell (or queries) over the network (not from master)
On 9/5/2014 3:27 PM, anthonyjschu...@gmail.com wrote:
> I think that should be possible. Make sure spark is installed on your
> local machine and is the same version as on the cluster.

It is the same version. I can telnet to master:7077, but when I run the spark-shell it times out.
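The telnet test above can be scripted. Here is a minimal sketch of a TCP reachability check with Python's standard `socket` module; the function name and default timeout are assumptions for illustration. Note that a successful connect only shows the client-to-master direction is open; as the later replies in this thread point out, the cluster also needs to reach the driver.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For the setup described here, the call would be `can_connect("master", 7077)`, with "master" standing for the master's hostname or IP.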
Re: Running spark-shell (or queries) over the network (not from master)
I think that should be possible. Make sure spark is installed on your local machine and is the same version as on the cluster.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Running-spark-shell-or-queries-over-the-network-not-from-master-tp13543p13590.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.