Re: fail and kill all tasks without killing job.
I believe that kill-task simply kills the task attempt, but then the same task starts again with a new attempt id.

Jay Vyas MMSB UCHC

On Jul 20, 2012, at 6:23 PM, Bejoy KS bejoy.had...@gmail.com wrote:

Hi Jay

Did you try hadoop job -kill-task task-id? And is that not working as desired?

Regards
Bejoy KS

-----Original Message-----
From: jay vyas jayunit...@gmail.com
Date: Fri, 20 Jul 2012 17:17:58
To: common-user@hadoop.apache.org
Reply-To: common-user@hadoop.apache.org
Subject: fail and kill all tasks without killing job.

Hi guys: I want my tasks to end/fail, but I don't want to kill my entire Hadoop job. I have a Hadoop job that runs 5 Hadoop jobs in a row. I'm on the last of those sub-jobs, and I want to fail all tasks so that the TaskTracker stops delegating them, and the main Hadoop job can naturally come to a close. However, when I run hadoop job -kill-attempt / -fail-attempt, the JobTracker seems to simply relaunch the same tasks with new ids. How can I tell the JobTracker to give up on re-delegating?
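Archive note: the relaunch behavior described above is the framework's normal retry of failed or killed attempts (and killed attempts, unlike failed ones, do not count against the retry limit). If the goal is for a failed attempt to fail its task terminally, the retry limits can be lowered. A hedged sketch; the property names below are the Hadoop 1.x/0.20 ones (newer releases use mapreduce.map.maxattempts etc.), so verify against your release:

```xml
<!-- mapred-site.xml: do not retry attempts, so one failed attempt fails the task -->
<property>
  <name>mapred.map.max.attempts</name>
  <value>1</value>
</property>
<property>
  <name>mapred.reduce.max.attempts</name>
  <value>1</value>
</property>
```

Relatedly, mapred.max.map.failures.percent (JobConf.setMaxMapTaskFailuresPercent) lets a configurable fraction of tasks fail without failing the whole job, which may be closer to what the original poster wants.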
Re: remote job submission
Thanks, Harsh. I have another question, though. You mentioned that "the client needs access to the DataNodes (for actually writing the previous files to DFS for the JobTracker to pick up)." What do you mean by previous files? It seems like, if designing Hadoop from scratch, I wouldn't want to force the client to communicate with DataNodes at all, since those can be added and removed during a job.

Jay Vyas MMSB UCHC

On Apr 21, 2012, at 1:14 AM, Harsh J ha...@cloudera.com wrote:

... the DataNodes (for actually writing the previous files to DFS for the JobTracker to pick up)
Re: Accessing global Counters
No, reducers can't access mapper counters. Maybe there's a way to intermediately put counters in the distributed cache?

Jay Vyas MMSB UCHC

On Apr 20, 2012, at 1:24 PM, Robert Evans ev...@yahoo-inc.com wrote:

There was a discussion about this several months ago:
http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-user/201112.mbox/%3CCADYHM8xiw8_bF=zqe-bagdfz6r3tob0aof9viozgtzeqgkp...@mail.gmail.com%3E

The conclusion is that if you want to read them from the reducer, you are going to have to do something special until someone finds time to implement it as part of https://issues.apache.org/jira/browse/MAPREDUCE-3520

--Bobby Evans

On 4/20/12 11:36 AM, Amith D K amit...@huawei.com wrote:

Yes, you can use a user-defined counter as Jagat suggested. The counter can be an enum as Jagat described, or any string (these are called dynamic counters). It is easier to use enum counters than dynamic counters; ultimately it depends on your use case :)

Amith

From: Jagat [jagatsi...@gmail.com]
Sent: Saturday, April 21, 2012 12:25 AM
To: common-user@hadoop.apache.org
Subject: Re: Accessing global Counters

Hi

You can create your own counters like:

enum CountFruits { Apple, Mango, Banana }

And in your mapper class, when you see the condition to increment, you can use the Reporter incrCounter method to do the same:
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/Reporter.html#incrCounter(java.lang.Enum,%20long)

e.g.

// I saw an Apple; increment it by one
reporter.incrCounter(CountFruits.Apple, 1);

Now you can access them using job.getCounters:
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/Job.html#getCounters()

Hope this helps.

Regards,
Jagat Singh

On Fri, Apr 20, 2012 at 9:43 PM, Gayatri Rao rgayat...@gmail.com wrote:

Hi All,

Is there a way for me to set global counters in a Mapper and access them from the reducer? Could you suggest how I can achieve this?

Thanks
Gayatri
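Archive note: the reason reducers can't see mapper counters is that per-task counter updates are only merged into job-level totals by the framework as attempts report in; the totals exist at the job level, not inside any one task. The Hadoop-free sketch below (class and enum names are illustrative, mirroring Jagat's example) shows just that merge step in plain Java:

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class CounterSketch {
    // Illustrative counter group, mirroring the enum in Jagat's reply.
    enum CountFruits { Apple, Mango, Banana }

    // Merge per-task counter maps the way the framework aggregates task
    // counters into job-level totals (available via job.getCounters()).
    static EnumMap<CountFruits, Long> merge(List<Map<CountFruits, Long>> perTask) {
        EnumMap<CountFruits, Long> total = new EnumMap<>(CountFruits.class);
        for (Map<CountFruits, Long> task : perTask)
            for (Map.Entry<CountFruits, Long> e : task.entrySet())
                total.merge(e.getKey(), e.getValue(), Long::sum);
        return total;
    }

    public static void main(String[] args) {
        // Two "map tasks" each counted some fruit; totals only exist after the merge.
        Map<CountFruits, Long> task1 = Map.of(CountFruits.Apple, 3L);
        Map<CountFruits, Long> task2 = Map.of(CountFruits.Apple, 2L, CountFruits.Mango, 1L);
        System.out.println(merge(List.of(task1, task2)));
    }
}
```

Since a reducer runs before the job completes, it can only see such totals if you ship them yourself (the "something special" Bobby mentions).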
Re: remote job submission
Re: Arindam's question on how to submit a job remotely. Here are my follow-up questions - hope this helps to guide the discussion:

1) Normally, what is the job client? Do you guys typically use the NameNode as the client?
2) In the case where the client != NameNode, how does the client know how to start up the TaskTrackers?

Jay Vyas MMSB UCHC

On Apr 20, 2012, at 11:19 AM, Amith D K amit...@huawei.com wrote:

I don't know your use case. If it's for testing and ssh across the machines is enabled, then you can write a script that uses ssh to run the jobs via the CLI. You can check ssh usage. Or else use Oozie.

From: Robert Evans [ev...@yahoo-inc.com]
Sent: Friday, April 20, 2012 11:17 PM
To: common-user@hadoop.apache.org
Subject: Re: remote job submission

You can use Oozie to do it.

On 4/20/12 8:45 AM, Arindam Choudhury arindamchoudhu...@gmail.com wrote:

Sorry, but can you give me an example?

On Fri, Apr 20, 2012 at 3:08 PM, Harsh J ha...@cloudera.com wrote:

Arindam,

If your machine can access the cluster's NN/JT/DN ports, then you can simply run your job from the machine itself.

On Fri, Apr 20, 2012 at 6:31 PM, Arindam Choudhury arindamchoudhu...@gmail.com wrote:

"If you are allowed a remote connection to the cluster's service ports, then you can directly submit your jobs from your local CLI. Just make sure your local configuration points to the right locations."

Can you elaborate in detail please?

On Fri, Apr 20, 2012 at 2:20 PM, Harsh J ha...@cloudera.com wrote:

If you are allowed a remote connection to the cluster's service ports, then you can directly submit your jobs from your local CLI. Just make sure your local configuration points to the right locations.
Otherwise, perhaps you can choose to use Apache Oozie (Incubating) (http://incubator.apache.org/oozie/). It provides a REST interface that launches jobs for you on the supplied clusters, but it's more oriented towards workflow management. Or perhaps HUE: https://github.com/cloudera/hue

On Fri, Apr 20, 2012 at 5:37 PM, Arindam Choudhury arindamchoudhu...@gmail.com wrote:

Hi,

Does Hadoop have any web service or other interface so I can submit jobs from a remote machine?

Thanks,
Arindam

--
Harsh J
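Archive note: concretely, "make sure your local configuration points to the right locations" means something like the fragment below on the submitting client. Hostnames and ports here are illustrative, and the property names are the Hadoop 1.x ones; adjust to your cluster:

```xml
<!-- core-site.xml on the submitting client -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://namenode.example.com:8020</value>
</property>

<!-- mapred-site.xml on the submitting client -->
<property>
  <name>mapred.job.tracker</name>
  <value>jobtracker.example.com:8021</value>
</property>
```

On Jay's question 2: the client never starts TaskTrackers. It writes the job files to HDFS and submits the job to the JobTracker, which schedules tasks on the already-running TaskTrackers.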
Reporter vs context
Hi guys: I notice that there's been some chatter about the Reporter in the context of counters. Forgive my ignorance here, as I've never seen Reporters used in real code. What is the difference between the use of our Context and Reporter objects, and how are they related? Is there any overlap in their functionality?

Jay Vyas MMSB UCHC
Re: Issue with loading the Snappy Codec
That is odd - why would it crash when your M/R job did not rely on Snappy? One possibility: maybe because your input is Snappy-compressed, Hadoop is detecting that compression and trying to use the Snappy codec to decompress?

Jay Vyas MMSB UCHC

On Apr 15, 2012, at 5:08 AM, Bas Hickendorff hickendorff...@gmail.com wrote:

Hello John,

I did restart them (in fact, I did a full reboot of the machine). The error is still there. I guess my question is: is it expected that Hadoop needs to do something with the SnappyCodec when mapred.compress.map.output is set to false?

Regards,
Bas

On Sun, Apr 15, 2012 at 12:04 PM, john smith js1987.sm...@gmail.com wrote:

Can you restart the tasktrackers once and run the job again? It refreshes the classpath.

On Sun, Apr 15, 2012 at 11:58 AM, Bas Hickendorff hickendorff...@gmail.com wrote:

Thanks. The native snappy libraries I have installed. However, I use the normal jars that you get when downloading Hadoop; I am not compiling Hadoop myself. I do not want to use the Snappy codec (I don't care about compression at the moment), but it seems it is needed anyway? I added this to mapred-site.xml:

<property>
  <name>mapred.compress.map.output</name>
  <value>false</value>
</property>

But it still fails with the error of my previous email (SnappyCodec not found).

Regards,
Bas

On Sat, Apr 14, 2012 at 6:30 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote:

Hadoop has integrated snappy via installed native libraries instead of snappy-java.jar (ref https://issues.apache.org/jira/browse/HADOOP-7206):

- You need to have the snappy system libraries (snappy and snappy-devel) installed before you compile Hadoop. (RPMs are available on the web, http://pkgs.org/centos-5-rhel-5/epel-i386/21/ for example.)
- When you build Hadoop, you will need to compile the native libraries (by passing -Dcompile.native=true to ant) to get snappy support.
- You also need to make sure that the snappy system library is available on the library path for all mapreduce tasks at runtime. Usually if you install them in /usr/lib or /usr/local/lib, it should work.

HTH,
+Vinod

On Apr 14, 2012, at 4:36 AM, Bas Hickendorff wrote:

Hello,

When I start a map-reduce job, it starts, and after a short while fails with the error below (SnappyCodec not found). I am currently starting the job from other Java code (so the Hadoop executable in the bin directory is not used anymore), but in principle this seems to work (in the admin of the JobTracker the job shows up when it starts). However, after a short while the map task fails with:

java.lang.IllegalArgumentException: Compression codec org.apache.hadoop.io.compress.SnappyCodec not found.
        at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:96)
        at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:134)
        at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:62)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:522)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:416)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.io.compress.SnappyCodec
        at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
        at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:334)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:264)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820)
        at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:89)
        ... 10 more

I confirmed that the SnappyCodec class is present in hadoop-core-1.0.2.jar, and snappy-java-1.0.4.1.jar is present as well. The directory of those jars is on the HADOOP_CLASSPATH, but it seems it still cannot find it. I also checked that the config files of Hadoop are read. I run all nodes on localhost.

Any suggestions on what could be the cause of the issue?

Regards,
Bas
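Archive note: the stack trace shows why mapred.compress.map.output has no effect here. CompressionCodecFactory (built by LineRecordReader when reading input) instantiates every codec class named in io.compression.codecs, regardless of output-compression settings, so a missing SnappyCodec class fails the task even when nothing is compressed. One hedged workaround sketch - the value below is the 1.x default list minus SnappyCodec; check your own core-site.xml/defaults before copying, and only do this if no input is actually snappy-compressed:

```xml
<!-- core-site.xml: codec list with SnappyCodec removed, so the
     factory never tries to load that class -->
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
</property>
```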
Re: Professional Hiring: Architect and Developer in Hadoop Area ( Beijing, China )
I'm sure I speak quite accurately for the moderators: ***This is not a job board***

Jay Vyas MMSB UCHC

On Apr 9, 2012, at 10:03 AM, Vishal Kumar Gupta groups...@gmail.com wrote:

hi Sarah,

Please find my updated resume attached with this mail.

Regards,
vishal

2012/4/9 Bing Li sarah.lib...@gmail.com

[Translated from Chinese:] A well-known, large IT company (top 3) development center is hiring Hadoop experts (Beijing) - not a headhunter posting.

Position: Hadoop system and platform development (architect, senior developer).

Requirements:
1. Experience designing and developing large distributed systems (3+ years; 5+ years for architects); large-scale production Hadoop experience preferred.
2. Solid programming and debugging experience (Java or C/C++), a strong grounding in computer science fundamentals, and the ability to learn quickly.
3. Strong communication and collaboration skills; fluent English (including spoken).

We offer competitive compensation - welcome aboard!

If interested, please send your resume to: sarah.lib...@gmail.com
Job, JobConf, and Configuration.
Hi guys, just a theoretical question here: I notice in chapter 1 of the O'Reilly Hadoop book that the new-API example has *no* Configuration object. Why is that? I thought the new API still uses/needs a Configuration class when running jobs.

Jay Vyas MMSB UCHC

On Apr 7, 2012, at 4:29 PM, Harsh J ha...@cloudera.com wrote:

MapReduce sets mapred.child.tmp for all tasks to be the task attempt's WorkingDir/tmp automatically. This also sets the -Djava.io.tmpdir prop for each task at JVM boot. Hence you may use the regular Java API to create a temporary file:
http://docs.oracle.com/javase/6/docs/api/java/io/File.html#createTempFile(java.lang.String,%20java.lang.String)

These files would also be automatically deleted after the task attempt is done.

On Sun, Apr 8, 2012 at 2:14 AM, Ondřej Klimpera klimp...@fit.cvut.cz wrote:

Hello,

I would like to ask you if it is possible to create and work with a temporary file while in a map function. I suppose that a map function is running on a single node in the Hadoop cluster. So what is a safe way to create a temporary file and read from it in one map() run? If it is possible, is there a size limit for the file? The file cannot be created before the Hadoop job is created; I need to create and process the file inside map().

Thanks for your answer.
Ondrej Klimpera

--
Harsh J
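Archive note: Harsh's suggestion boils down to plain java.io.File.createTempFile, which under MapReduce lands in the attempt's working directory because of the -Djava.io.tmpdir setting he describes. A self-contained sketch (no Hadoop dependency; the payload string is illustrative):

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class TmpFileSketch {
    // Write then read back a temp file, as one would inside map().
    // Under MapReduce, java.io.tmpdir points at the attempt's work dir,
    // so the file is task-local and cleaned up with the attempt.
    static String roundTrip(String payload) {
        try {
            File tmp = File.createTempFile("map-scratch", ".txt");
            tmp.deleteOnExit();
            try (FileWriter w = new FileWriter(tmp)) {
                w.write(payload);
            }
            try (BufferedReader r = new BufferedReader(new FileReader(tmp))) {
                return r.readLine();
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello from map()"));
    }
}
```

The size limit is effectively the free space on the node's local disk holding the tmp directory, not an HDFS block size.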
Re: Get Current Block or Split ID, and using it, the Block Path
I have a related question about blocks related to this. Normally, a reduce job outputs several files, all in the same directory. But why? Since we know that Hadoop is abstracting our files for us, shouldn't the part-r-* outputs ultimately be thought of as a single file? What is the correspondence between the part-r-00000, part-r-00001, ... outputs from a reducer and the native blocks stored by HDFS (if any)?

Jay Vyas MMSB UCHC

On Apr 8, 2012, at 2:00 PM, Harsh J ha...@cloudera.com wrote:

Deepak,

On Sun, Apr 8, 2012 at 9:46 PM, Deepak Nettem deepaknet...@gmail.com wrote:

Hi,

Is it possible to get the 'id' of the currently executing split or block from within the mapper? Using this block id / split id, I want to be able to query the namenode to get the names of hosts having that block/split, and the actual path to the data.

You can get the list of host locations for the current Mapper's split via: https://gist.github.com/2339170 (or generally from a FileSystem object via https://gist.github.com/2339181).

You can't get block IDs via any available publicly supported APIs. Therefore, you may consider getting the local block file path an unavailable option too.

I need this for some analytics that I'm doing. Is there a client API that allows doing this? If not, what's the best way to do this?

There are some ways to go about it (I wouldn't consider this impossible to do, for sure), but I'm curious what your 'analytics' is and how it correlates with needing block IDs and actual block file paths - because your problem may also be solvable by other, pre-available means.

--
Harsh J
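Archive note on Jay's part-r-* question: there is one output file per reducer, because each reducer writes its own file independently (no coordination needed); each part file is a separate HDFS file, split into blocks on its own. They sit in one directory precisely so they can be treated as one logical dataset downstream. Which reducer (and hence which part file) a key lands in is decided by the partitioner; a self-contained sketch of the default HashPartitioner's routing logic (keys here are strings for illustration):

```java
public class PartitionSketch {
    // The default HashPartitioner routes a key to reducer
    // (hash & MAX_VALUE) % numReduceTasks; the mask keeps the value
    // non-negative even when hashCode() is negative.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        // Each key deterministically maps to one of the part-r-NNNNN files.
        System.out.println(partitionFor("apple", 4));
        System.out.println(partitionFor("banana", 4));
    }
}
```

So part-r-00000 is simply "everything the partitioner sent to reducer 0", not a block-level artifact of HDFS.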
Namespace logs : a common issue?
Hi guys: I'm noticing that namespace conflicts or differences are a common theme in Hadoop, both in my experience and now on this listserv. Does anyone have any thoughts on why this is such a common issue, and how it will be dealt with in new releases?

Jay Vyas MMSB UCHC
Hadoop fs custom commands
Hi guys: I wanted to make some custom "hadoop fs" commands. Is this feasible/practical? In particular, I wanted to summarize file sizes and print some useful estimates of things on the fly from my cluster. I'm not sure how the hadoop shell commands are implemented... but I thought maybe there is a higher-level shell language or API which they might use that I can play with?
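Archive note: the built-in commands are implemented in Java (org.apache.hadoop.fs.FsShell, driven by the FileSystem API), so one route is a small Java program against FileSystem. A lower-tech sketch is to post-process hadoop fs -du output in the shell; since that needs a live cluster, the pipeline below is fed sample "size path" lines so the summarization step stands alone:

```shell
# In practice: hadoop fs -du /my/dir | awk '{ total += $1 } END { ... }'
# Fed here with sample du-style output (sizes and paths are illustrative).
printf '1024 /my/dir/a\n2048 /my/dir/b\n' |
  awk '{ total += $1 } END { printf "%.1f KB total\n", total/1024 }'
```

The same pattern works for any on-the-fly estimate (averages, counts per directory, etc.) without touching Hadoop's source.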
Re: namespace error after formatting namenode (psuedo distr mode).
Thanks a lot, Arpit. I will try this first thing in the morning. For now, I need a glass of wine.

Jay Vyas MMSB UCHC

On Mar 30, 2012, at 10:38 PM, Arpit Gupta ar...@hortonworks.com wrote:

The namespace id is persisted in the datanode data directories. As you formatted the namenode, these id's no longer match. So stop the datanode, clean up your dfs.data.dir on your system - which from the logs seems to be /private/tmp/hadoop-Jpeerindex/dfs/data - and then start the datanode.

--
Arpit Gupta
Hortonworks Inc.
http://hortonworks.com/

On Mar 30, 2012, at 2:33 PM, Jay Vyas wrote:

Hi guys! This is very strange - I have formatted my namenode (pseudo-distributed mode) and now I'm getting some kind of namespace error. Without further ado, here is the interesting output of my logs:

Last login: Fri Mar 30 19:29:12 on ttys009
doolittle-5:~ Jpeerindex$ cat Development/hadoop-0.20.203.0/logs/*
2012-03-30 22:28:31,640 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = doolittle-5.local/192.168.3.78
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.203.0
STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333; compiled by 'oom' on Wed May 4 07:57:50 PDT 2011
************************************************************/
2012-03-30 22:28:32,138 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2012-03-30 22:28:32,190 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2012-03-30 22:28:32,191 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2012-03-30 22:28:32,191 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2012-03-30 22:28:32,923 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2012-03-30 22:28:32,959 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2012-03-30 22:28:34,478 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 0 time(s).
2012-03-30 22:28:36,317 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /private/tmp/hadoop-Jpeerindex/dfs/data: namenode namespaceID = 1829914379; datanode namespaceID = 1725952472
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:232)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:147)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:354)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:268)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1480)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1419)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1437)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1563)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1573)
Re: Question about accessing another HDFS
I was confused about this for a while also. I don't have all the details, but I think my question on S.O. might help you - I was playing with different protocols, trying to find a way to programmatically access all data in HDFS:
http://stackoverflow.com/questions/7844458/how-can-i-access-hadoop-via-the-hdfs-protocol-from-java

Jay Vyas MMSB UCHC

On Dec 8, 2011, at 7:29 PM, Frank Astier fast...@yahoo-inc.com wrote:

"Can you show your code here? What URL protocol are you using?"

I guess I'm being very naive (and relatively new to HDFS). I can't show too much code, but basically, I'd like to do:

Path myPath = new Path("hdfs://A.mycompany.com//some-dir");

where Path is a Hadoop fs path. I think I can take it from there, if that worked... Did you mean that I need to address the namenode with an http:// address?

Thanks!
Frank

On Thu, Dec 8, 2011 at 5:47 PM, Tom Melendez t...@supertom.com wrote:

I'm hoping there is a better answer, but I'm thinking you could load another configuration file (with B.company in it) using Configuration, grab a FileSystem obj with that, and then go forward. Seems like some unnecessary overhead though.

Thanks,
Tom

On Thu, Dec 8, 2011 at 2:42 PM, Frank Astier fast...@yahoo-inc.com wrote:

Hi -

We have two namenodes set up at our company, say:

hdfs://A.mycompany.com
hdfs://B.mycompany.com

From the command line, I can do:

hadoop fs -ls hdfs://A.mycompany.com//some-dir

and

hadoop fs -ls hdfs://B.mycompany.com//some-other-dir

I'm now trying to do the same from a Java program that uses the HDFS API. No luck there - I get an exception: "Wrong FS". Any idea what I'm missing in my Java program?

Thanks,
Frank

--
Jay Vyas MMSB/UCHC
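Archive note: "Wrong FS" is raised by FileSystem's path check - a Path is only valid for a FileSystem whose URI scheme and authority match the Path's, and FileSystem.get(conf) is bound to fs.default.name. The usual fix is FileSystem.get(path.toUri(), conf) so the right cluster's filesystem is obtained per path. A Hadoop-free sketch of the comparison that triggers the error (hostnames are the ones from the thread):

```java
import java.net.URI;

public class WrongFsSketch {
    // Simplified version of the check Hadoop performs: a path belongs to a
    // filesystem only if scheme and authority (host) both match.
    static boolean matchesFileSystem(String path, String fsDefaultName) {
        URI p = URI.create(path);
        URI fs = URI.create(fsDefaultName);
        return p.getScheme().equals(fs.getScheme())
            && p.getAuthority().equals(fs.getAuthority());
    }

    public static void main(String[] args) {
        String fsDefault = "hdfs://A.mycompany.com"; // what the Configuration is bound to
        System.out.println(matchesFileSystem("hdfs://A.mycompany.com/some-dir", fsDefault));
        System.out.println(matchesFileSystem("hdfs://B.mycompany.com/some-other-dir", fsDefault));
    }
}
```

The CLI works for both clusters because hadoop fs resolves each fully-qualified URI to its own FileSystem instance, which is exactly what FileSystem.get(URI, Configuration) does in Java.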
Re: Hadoop MapReduce Poster
That's a great tutorial - I like the conciseness of it.

Jay Vyas MMSB UCHC

On Nov 1, 2011, at 1:39 AM, Prashant Sharma prashant.ii...@gmail.com wrote:

Hi Mathias, I wrote a small introduction, or a quick ramp-up, for starting out with Hadoop while learning it at my institute: http://functionalprograming.files.wordpress.com/2011/07/hadoop-2.pdf

thanks
-P

On Mon, Oct 31, 2011 at 6:44 PM, Mathias Herberts mathias.herbe...@gmail.com wrote:

Hi,

I'm in the process of putting together a 'Hadoop MapReduce Poster' so my students can better understand the various steps of a MapReduce job as run by Hadoop. I intend to release the poster under a CC-BY-NC-ND license.

I'd be grateful if people could review the current draft (3) of the poster. It is available as a 200 dpi PNG here: http://www.flickr.com/photos/herberts/6298203371

Any comment welcome.

Mathias.
Re: getting there (EOF exception).
Thanks! Yes, I agree... but are you sure about 8020? 8020 serves on 127.0.0.1 (rather than 0.0.0.0)... thus it is inaccessible to outside clients. That is very odd - why would that be the case? Any insights? (Using Cloudera's Hadoop VM.)

Sent from my iPad

On Oct 30, 2011, at 11:48 PM, Harsh J ha...@cloudera.com wrote:

Hey Jay,

I believe this may be related to your other issues as well, but 50070 is NOT the port you want to connect to. 50070 serves over HTTP, while the default port for IPC connections (fs.default.name) is 8020, or whatever you have configured.

On 31-Oct-2011, at 5:17 AM, Jay Vyas wrote:

Hi guys: What is the meaning of an EOF exception when trying to connect to Hadoop by creating a new FileSystem object? Does this simply mean the system can't be read?

java.io.IOException: Call to /172.16.112.131:50070 failed on local exception: java.io.EOFException
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1139)
        at org.apache.hadoop.ipc.Client.call(Client.java:1107)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
        at $Proxy0.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:398)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:384)
        at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:111)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:213)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:180)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1514)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:1548)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1530)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:228)
        at sb.HadoopRemote.main(HadoopRemote.java:35)
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:375)
        at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:812)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:720)

--
Jay Vyas MMSB/UCHC
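Archive note: the EOFException here is consistent with Harsh's diagnosis - the IPC client connected to an HTTP port (50070), which closed the connection mid-handshake instead of speaking the IPC protocol. A plain-socket probe is a quick way to check reachability of a port before blaming the FileSystem code (host and ports below are illustrative):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {
    // Returns true if something accepts TCP connections at host:port
    // within timeoutMs; it says nothing about WHAT protocol is served there.
    static boolean reachable(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // e.g. probe the NameNode IPC port (8020 by default) vs HTTP (50070)
        System.out.println(reachable("127.0.0.1", 8020, 500));
        System.out.println(reachable("127.0.0.1", 50070, 500));
    }
}
```

If 8020 is reachable but the EOF persists, the next suspects are a protocol/version mismatch between client and cluster jars, or the bind-address issue Jay raises above.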
Connecting to vm through java
Hi guys: I'm getting the dreaded org.apache.hadoop.ipc.Client$Connection handleConnectionFailure when connecting to Cloudera's Hadoop (running in a VM) to request running a simple M/R job (from a machine outside the Hadoop VM). I've seen a lot of posts about this online, and it's also on Stack Overflow here:
http://stackoverflow.com/questions/6997327/connecting-to-cloudera-vm-from-my-desktop

Any tips on debugging Java's connection to HDFS over the network? It's not entirely clear to me how the connection is made/authenticated between the client and Hadoop - for example, is a passwordless ssh file required? I believe this error is related to authentication, but am not sure of the best way to test it. I have confirmed that the IP is valid, and it appears that HDFS is being run and served over the right default port in the VM.

Sent from my iPad
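Archive note: no ssh keys are involved in job submission - the client opens plain TCP connections to the NameNode/JobTracker IPC ports (ssh is only used by the start/stop scripts). A common culprit with VM setups, offered here as an assumption to verify rather than a diagnosis: if fs.default.name inside the VM names localhost, the NameNode binds its IPC server to 127.0.0.1 and outside clients get connection failures even though the port "is served" inside the VM. A sketch of the change inside the VM (the address is illustrative; the client's config must use the same externally reachable name):

```xml
<!-- core-site.xml inside the VM: use an externally reachable address
     instead of localhost (the 192.168.x.x value is illustrative) -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://192.168.1.50:8020</value>
</property>
```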