I wouldn't try to play with forwarding and tunnelling; it's always hard to
work out which ports get used where, and the services expect the hostnames
they advertise to match the ones in the URLs you use.

Can't you just set up an entry in the Windows hosts file? It's what I do
(on Unix, in /etc/hosts) to talk to VMs.
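A minimal sketch of such an entry (the address and hostname here are
hypothetical -- take the real ones from ifconfig on the guest):

  # C:\Windows\System32\drivers\etc\hosts on the Windows host
  192.168.56.101   sandbox.hortonworks.com

Then point Spark at hdfs://sandbox.hortonworks.com:8020/... rather than
localhost.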
On 25 Aug 2015, at 04:49, Dino Fancellu <d...@felstar.com> wrote:

> Tried adding 50010, 50020 and 50090. Still no difference.
>
> I can't imagine I'm the only person on the planet wanting to do this.
>
> Anyway, thanks for trying to help.
>
> Dino.
>
> On 25 August 2015 at 08:22, Roberto Congiu <roberto.con...@gmail.com> wrote:
>> Port 8020 is not the only port you need tunnelled for HDFS to work. If
>> you only list the contents of a directory, port 8020 is enough. For
>> instance, using something like
>>
>>   val p = new org.apache.hadoop.fs.Path("hdfs://localhost:8020/")
>>   val fs = p.getFileSystem(sc.hadoopConfiguration)
>>   fs.listStatus(p)
>>
>> you should see the file list. But when you access a file, the client has
>> to fetch its blocks, so it also has to connect to the DataNode. The
>> error 'could not obtain block' means it can't get that block from the
>> DataNode. Refer to
>> http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-1.2.1/bk_reference/content/reference_chap2_1.html
>> for the complete list of ports that also need to be tunnelled.
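>>
>> To make the split concrete, here's a quick check you can run from the
>> same shell (a sketch only; it assumes sc is your SparkContext and reuses
>> the file from this thread):
>>
>>   val p = new org.apache.hadoop.fs.Path("hdfs://localhost:8020/tmp/people.txt")
>>   val fs = p.getFileSystem(sc.hadoopConfiguration)
>>   fs.getFileStatus(p)  // metadata: NameNode only, works over 8020 alone
>>   val in = fs.open(p)  // still NameNode: fetches the block locations
>>   println(in.read())   // the read is the call that dials a DataNode
>>   in.close()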
>>
>> 2015-08-24 13:10 GMT-07:00 Dino Fancellu <d...@felstar.com>:
>>>
>>> Changing the IP to the guest IP address just never connects.
>>>
>>> The VM has port tunnelling, and it passes all the main ports,
>>> 8020 included, through to the host.
>>>
>>> You can tell that it was talking to the guest VM before, simply
>>> because it reported when a file was not found.
>>>
>>> The error is:
>>>
>>> Exception in thread "main" org.apache.spark.SparkException: Job
>>> aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most
>>> recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost):
>>> org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block:
>>> BP-452094660-10.0.2.15-1437494483194:blk_1073742905_2098
>>> file=/tmp/people.txt
>>>
>>> but I have no idea what it means by that. It can certainly find the
>>> file and knows it exists.
>>>
>>> On 24 August 2015 at 20:43, Roberto Congiu <roberto.con...@gmail.com> wrote:
>>>> When you launch your HDP guest VM, it most likely gets launched with
>>>> NAT and an address on a private network (192.168.x.x), so on your
>>>> Windows host you should use that address (you can find it with
>>>> ifconfig on the guest OS). I usually add an entry to my /etc/hosts for
>>>> VMs that I use often. If you use Vagrant, there's also a Vagrant
>>>> plugin that can do that automatically.
>>>> Also, I am not sure how the default HDP VM is set up, that is, whether
>>>> it binds HDFS only to 127.0.0.1 or to all addresses. You can check
>>>> that with netstat -a.
>>>>
>>>> R.
>>>>
>>>> 2015-08-24 11:46 GMT-07:00 Dino Fancellu <d...@felstar.com>:
>>>>>
>>>>> I have a file in HDFS inside my HortonWorks HDP 2.3_1 VirtualBox VM.
>>>>>
>>>>> If I go into the guest spark-shell and refer to the file like this,
>>>>> it works fine:
>>>>>
>>>>>   val words = sc.textFile("hdfs:///tmp/people.txt")
>>>>>   words.count
>>>>>
>>>>> However, if I try to access it from a local Spark app on my Windows
>>>>> host, it doesn't work:
>>>>>
>>>>>   val conf = new SparkConf().setMaster("local").setAppName("My App")
>>>>>   val sc = new SparkContext(conf)
>>>>>
>>>>>   val words = sc.textFile("hdfs://localhost:8020/tmp/people.txt")
>>>>>   words.count
>>>>>
>>>>> Emits
>>>>>
>>>>>
>>>>>
>>>>> The port 8020 is open, and if I choose the wrong file name, it will
>>>>> tell me
>>>>>
>>>>>
>>>>>
>>>>> My pom has
>>>>>
>>>>>   <dependency>
>>>>>     <groupId>org.apache.spark</groupId>
>>>>>     <artifactId>spark-core_2.11</artifactId>
>>>>>     <version>1.4.1</version>
>>>>>     <scope>provided</scope>
>>>>>   </dependency>
>>>>>
>>>>> Am I doing something wrong?
>>>>>
>>>>> Thanks.
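
Putting the two together, the host-side code would end up something like
the sketch below. One extra knob the thread doesn't cover, so treat it as
an assumption to verify: the HDFS client can be told to dial DataNodes by
hostname rather than by the private IP the NameNode hands back, which is
exactly the address that breaks under NAT. The hostname is the
hypothetical one from the hosts entry above.

  import org.apache.spark.{SparkConf, SparkContext}

  val conf = new SparkConf().setMaster("local").setAppName("My App")
  val sc = new SparkContext(conf)

  // Connect to DataNodes by hostname, not the NAT-private IP reported
  // by the NameNode; relies on the hosts entry resolving on Windows.
  sc.hadoopConfiguration.set("dfs.client.use.datanode.hostname", "true")

  val words = sc.textFile("hdfs://sandbox.hortonworks.com:8020/tmp/people.txt")
  println(words.count)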