Found the issue in JIRA:

https://issues.apache.org/jira/browse/SPARK-4389?jql=project%20%3D%20SPARK%20AND%20text%20~%20NAT

On Tue, Jan 6, 2015 at 10:45 AM, Aaron <aarongm...@gmail.com> wrote:

> From what I can tell, this isn't a "firewall" issue per se; it's how the
> Remoting Service "binds" to an IP given command-line parameters.  So, if
> I have a VM (or an OpenStack or EC2 instance) running on a private
> network, where the IP address is 192.168.X.Y, I can't tell the workers
> to "reach me on this IP," because the Remoting Service binds to the
> interface passed in those parameters.
>
> So, if my "public" IP is a routable address, but the one the VM sees is
> the 192.168.X.Y address, it appears I can't do some kind of port
> forwarding from the external to the internal.  Is this correct?
>
> If I set the spark.driver.host and spark.driver.port properties at the
> command line, it tries to actually bind to that IP, rather than just
> telling the workers "reach back to this IP."  Is there a way around
> this?  Is there a way to tell the workers which IP address to use
> WITHOUT binding to it?  Maybe allow the Remoting Service to bind to the
> internal IP but advertise it differently?
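>
> For concreteness, here's the kind of launch I mean (the master URL, IP,
> and port below are just placeholders for our real values):
>
>   ./bin/spark-shell \
>     --master spark://cluster-master:7077 \
>     --conf spark.driver.host=203.0.113.10 \
>     --conf spark.driver.port=51000
>
> Ideally the driver would bind on 192.168.X.Y but advertise 203.0.113.10
> to the workers; instead it tries to bind 203.0.113.10 directly.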
>
>
>
> On Mon, Jan 5, 2015 at 9:02 AM, Aaron <aarongm...@gmail.com> wrote:
>
>> Thanks for the link!  However, from reviewing the thread, it appears
>> you cannot have a NAT/firewall between the cluster and the
>> spark-driver/shell.  Is this correct?
>>
>> When the shell starts up, it binds to the internal IP (e.g.,
>> 192.168.x.y), not the external floating IP, which is routable from the
>> cluster.
>>
>> When I set a static port for spark.driver.port and set
>> spark.driver.host to the floating IP address, I get the same exception
>> (Caused by: java.net.BindException: Cannot assign requested address:
>> bind), because of the use of the InetAddress.getHostAddress method
>> call.
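>>
>> That failure makes sense once you look at the instance's interfaces:
>> with OpenStack floating IPs, the NAT happens out in the network layer,
>> so the floating address never shows up on the VM itself.  For example
>> (the interface name and addresses below are illustrative, not our real
>> ones):
>>
>>   $ ip addr show eth0 | grep 'inet '
>>       inet 192.168.1.10/24 brd 192.168.1.255 scope global eth0
>>
>> Since the floating IP isn't assigned to any local interface, binding
>> to it can only fail with "Cannot assign requested address".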
>>
>>
>> Cheers,
>> Aaron
>>
>>
>> On Mon, Jan 5, 2015 at 8:28 AM, Akhil Das <ak...@sigmoidanalytics.com>
>> wrote:
>>
>>> You can have a look at this discussion:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Submitting-Spark-job-on-Unix-cluster-from-dev-environment-Windows-td16989.html
>>>
>>> Thanks
>>> Best Regards
>>>
>>> On Mon, Jan 5, 2015 at 6:11 PM, Aaron <aarongm...@gmail.com> wrote:
>>>
>>>> Hello there, I was wondering if there is a way to have the spark-shell
>>>> (or pyspark) sit behind a NAT when talking to the cluster?
>>>>
>>>> Basically, we have OpenStack instances that run with internal IPs,
>>>> and we assign floating IPs as needed.  Since the workers make direct
>>>> TCP connections back, the spark-shell is binding to the internal IP,
>>>> not the "floating" one.  Our other use case is running Vagrant VMs on
>>>> our local machines, but we don't have those VMs' NICs set up in
>>>> "bridged" mode, so they too have "internal" IPs.
>>>>
>>>> I tried using SPARK_LOCAL_IP and the various --conf spark.driver.host
>>>> parameters, but it still gets "angry."
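>>>>
>>>> For example (the floating IP below is a placeholder), both of these
>>>> fail the same way:
>>>>
>>>>   SPARK_LOCAL_IP=203.0.113.10 ./bin/spark-shell
>>>>   ./bin/spark-shell --conf spark.driver.host=203.0.113.10
>>>>
>>>> because in both cases the shell tries to bind that address rather
>>>> than just advertising it to the workers.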
>>>>
>>>> Any thoughts/suggestions?
>>>>
>>>> Currently our workaround is to open a VPNC connection from inside the
>>>> Vagrant VMs or OpenStack instances, but that doesn't seem like a
>>>> long-term plan.
>>>>
>>>> Thanks in advance!
>>>>
>>>> Cheers,
>>>> Aaron
>>>>
>>>
>>>
>>
>
