Re: YARN: LocalResources and file distribution

2013-12-05 Thread omkar joshi
Add this file to the files to be localized (LocalResourceRequest), and then refer to it as ./list.ksh. While adding it as a LocalResource, specify the path you have mentioned. On Thu, Dec 5, 2013 at 10:40 PM, Krishna Kishore Bonagiri < write2kish...@gmail.com> wrote: > Hi Arun, > > I have
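The advice above hinges on a naming contract: the key under which a resource is registered becomes a symlink in the container's working directory, so the launch command refers to it as "./&lt;key&gt;". The real API is the Java YARN client (LocalResource / ContainerLaunchContext.setLocalResources / setCommands); the sketch below is a language-neutral Python illustration of that contract, with the hdfs:// path taken from the thread.

```python
# Sketch of the naming contract between LocalResources and setCommands():
# the *key* used when registering the resource becomes a symlink in the
# container's working directory, so the command invokes "./<key>".
# (Illustrative only; the real calls are Java YARN API.)

def launch_spec(hdfs_path, link_name):
    """Pair a localized resource with the command that invokes it."""
    local_resources = {link_name: hdfs_path}        # key -> what to download
    command = "./%s 1>stdout 2>stderr" % link_name  # refer via the symlink
    return local_resources, command

resources, cmd = launch_spec(
    "hdfs://isredeng:8020/user/kbonagir/KKDummy/list.ksh", "list.ksh")
print(cmd)  # ./list.ksh 1>stdout 2>stderr
```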

Re: YARN: LocalResources and file distribution

2013-12-05 Thread Krishna Kishore Bonagiri
Hi Arun, I have copied a shell script to HDFS and am trying to execute it on containers. How do I specify my shell script's path in the setCommands() call on ContainerLaunchContext? I am doing it this way: String shellScriptPath = "hdfs://isredeng:8020/user/kbonagir/KKDummy/list.ksh"; command

How can Hive handle the complex data Type through SerDe and UDF/GenericUDF?

2013-12-05 Thread Baron Tsai
Table definition: CREATE TABLE kvpair ( id STRING, arrstr ARRAY, arrmap ARRAY> ) ROW FORMAT SERDE "com.cloudera.hive.serde.JSONSerDe"; ## com.cloudera.hive.serde.JSONSerDe is a SerDe that can handle complex JSON data. ---

Re: Writing to remote HDFS using C# on Windows

2013-12-05 Thread Fengyun RAO
Thanks! I tried WebHDFS, which also works well if I copy local files to HDFS, but I still can't find a way to open a filestream and write to it. 2013/12/6 Vinod Kumar Vavilapalli > You can try using WebHDFS. > > Thanks, > +Vinod > > > On Thu, Dec 5, 2013 at 6:04 PM, Fengyun RAO wrote: > >> Hi,
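Since WebHDFS is a plain REST API, streaming a write is a two-step HTTP exchange that any client stack (including C#'s HttpWebRequest) can perform: first a PUT with op=CREATE to the NameNode, which answers with a 307 redirect, then a PUT of the file body to the DataNode URL in the Location header. A minimal Python sketch of step 1, assuming the default Hadoop 2.x NameNode HTTP port 50070 and a hypothetical path:

```python
# WebHDFS two-step file write, step 1: build the CREATE request sent to the
# NameNode. The NameNode replies with a 307 redirect; step 2 PUTs the bytes
# to the DataNode URL in the Location header. (Host, port, and path here
# are assumptions for illustration.)
from urllib.parse import urlencode

def webhdfs_create_url(host, port, path, user):
    """Build the step-1 CREATE URL sent to the NameNode."""
    query = urlencode({"op": "CREATE", "user.name": user, "overwrite": "true"})
    return "http://%s:%d/webhdfs/v1%s?%s" % (host, port, path, query)

# Step 2 (network call, not run here): PUT the file body to the redirect
# target returned by step 1; WebHDFS answers 201 Created on success.
# import urllib.request
# req = urllib.request.Request(redirect_url, data=payload, method="PUT")
# urllib.request.urlopen(req)

print(webhdfs_create_url("namenode-host", 50070, "/user/rao/out.txt", "rao"))
```

The same flow with op=APPEND (a POST instead of PUT) extends an existing file, which covers the "open a stream and keep writing" case the thread asks about.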

How to convert SequenceFile to HFile?

2013-12-05 Thread Igor Gatis
I have a bunch of SequenceFiles which I'd like to convert to HFiles. How do I do that?

Re: error in copy from local file into HDFS

2013-12-05 Thread ch huang
hi: you are right, my DN disk was full. I deleted some files and now it works, thanks. On Fri, Dec 6, 2013 at 11:28 AM, Vinayakumar B wrote: > Hi Ch huang, > > > >Please check whether all datanodes in your cluster have enough disk > space and number non-decommissioned nodes should be non-zero

Re: Container [pid=22885,containerID=container_1386156666044_0001_01_000013] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 332.5 GB of 8 GB virtual memo

2013-12-05 Thread Vinod Kumar Vavilapalli
Something looks really bad on your cluster. The JVM's heap size is 200MB but its virtual memory has ballooned to a monstrous 332GB. Does that ring any bell? Can you run regular java applications on this node? This doesn't seem related to YARN per se. +Vinod Hortonworks Inc. http://hortonworks.com/
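For context, the kill described in the subject line is enforced by the NodeManager's virtual-memory check: the vmem limit is physical memory times yarn.nodemanager.vmem-pmem-ratio. A hedged yarn-site.xml fragment (values are examples, not a recommendation; the 8 GB limit on a 1 GB container in this thread implies a ratio of 8):

```xml
<!-- yarn-site.xml: virtual-memory enforcement (example values) -->
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>8</value>
  <!-- default is 2.1; vmem limit = container pmem * this ratio -->
</property>
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
  <!-- last resort: disable the vmem kill entirely -->
</property>
```

Disabling the check only hides the symptom; a 332 GB virtual size still points at something wrong on the node, as Vinod notes.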

Re: Monitor network traffic in hadoop

2013-12-05 Thread ch huang
hi, Abdul Navaz: assign the shuffle port in each NM using the option "mapreduce.shuffle.port" in mapred-site.xml, then monitor this port using tcpdump or wireshark. Hope this info can help you. On Fri, Dec 6, 2013 at 11:22 AM, navaz wrote: > Hello > > I am following the tutorial hadoop on single node cluste
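The suggestion above can be sketched as a mapred-site.xml fragment; 13562 is the Hadoop 2 default shuffle-handler port, and any free port works:

```xml
<!-- mapred-site.xml: pin the ShuffleHandler port so the shuffle traffic
     is easy to capture (13562 is the Hadoop 2 default) -->
<property>
  <name>mapreduce.shuffle.port</name>
  <value>13562</value>
</property>
<!-- then, on each NodeManager host, something like:
     tcpdump -i eth0 tcp port 13562 -->
```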

RE: error in copy from local file into HDFS

2013-12-05 Thread Vinayakumar B
Hi Ch huang, Please check whether all datanodes in your cluster have enough disk space and the number of non-decommissioned nodes is non-zero. Thanks and regards, Vinayakumar B From: ch huang [mailto:justlo...@gmail.com] Sent: 06 December 2013 07:14 To: user@hadoop.apache.org Subject: error

Re: Writing to remote HDFS using C# on Windows

2013-12-05 Thread Vinod Kumar Vavilapalli
You can try using WebHDFS. Thanks, +Vinod On Thu, Dec 5, 2013 at 6:04 PM, Fengyun RAO wrote: > Hi, All > > Is there a way to write files into remote HDFS on Linux using C# on > Windows? We want to use HDFS as data storage. > > We know there is HDFS java API, but not C#. We tried SAMBA for file

Re: how many job request can be in queue when the first MR JOB is blocked due to lack of resource?

2013-12-05 Thread ch huang
I searched the code; only the src/hadoop-mapreduce1-project/src/contrib/capacity-scheduler/src/test/org/apache/hadoop/mapred/TestCapacitySchedulerConf.java file has the variables. I did see it on ./src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/a

Monitor network traffic in hadoop

2013-12-05 Thread navaz
Hello I am following the tutorial hadoop on a single node cluster and I am able to test the word count map reduce program. It's working fine. I would like to know how to monitor the network traffic when the shuffle phase occurs, via wireshark or some other means. Please guide me. Thanks Abdul Navaz Graduate student

issue about terasort read partition file from local fs instead HDFS

2013-12-05 Thread ch huang
hi,maillist: I tried to use terasort to benchmark my cluster. When I ran it, I found terasort tried to read the partition file from the local filesystem, not HDFS. I see a partition file in HDFS; when I copied this file into the local filesystem and ran terasort again, it worked fine, but it ran on the local hos

MapReduce Job running Problems with queue designation in fair scheduler for Yarn-2.2.0.

2013-12-05 Thread #TANG SHANJIANG#
Hi, I encounter a problem with Yarn's fair scheduler. I first set up a queue by configuring fair-scheduler.xml as below. Next I try to submit a job to that queue by designating the queue name via "mapreduce.job.queuename=amelie". fair-scheduler.xml: 1 mb,1vcores 900
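The message's XML was flattened by the archive ("1 mb,1vcores 900" is all that survives). For reference, a minimal fair-scheduler.xml queue definition looks like the hedged sketch below; every value is an assumption for illustration, not a reconstruction of the original:

```xml
<?xml version="1.0"?>
<!-- fair-scheduler.xml: minimal queue definition (all values assumed;
     the original message's XML was stripped by the archive) -->
<allocations>
  <queue name="amelie">
    <minResources>1024 mb,1 vcores</minResources>
    <maxResources>9216 mb,4 vcores</maxResources>
    <maxRunningApps>10</maxRunningApps>
  </queue>
</allocations>
```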

Re: how many job request can be in queue when the first MR JOB is blocked due to lack of resource?

2013-12-05 Thread rtejac
You can take a look at this parameter. This will control number of jobs a user can initialize. mapred.capacity-scheduler.queue.default.maximum-initialized-jobs-per-user = …. On Dec 5, 2013, at 5:33 PM, ch huang wrote: > hi,maillist: > any variables can control it?
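The parameter named above lives in the MRv1 capacity scheduler's configuration file; a hedged fragment (the value is an example):

```xml
<!-- capacity-scheduler.xml (MRv1 capacity scheduler): cap the number of
     jobs a single user may have initialized in the default queue
     (the value here is an example) -->
<property>
  <name>mapred.capacity-scheduler.queue.default.maximum-initialized-jobs-per-user</name>
  <value>2</value>
</property>
```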

Writing to remote HDFS using C# on Windows

2013-12-05 Thread Fengyun RAO
Hi, All Is there a way to write files into remote HDFS on Linux using C# on Windows? We want to use HDFS as data storage. We know there is an HDFS Java API, but not C#. We tried SAMBA for file sharing and FUSE for mounting HDFS. It worked if we simply copied files to HDFS, but if we open a filestream

error in copy from local file into HDFS

2013-12-05 Thread ch huang
hi,maillist: I got an error when I put a local file into HDFS: [root@CHBM224 test]# hadoop fs -copyFromLocal /tmp/aa /alex/ 13/12/06 09:40:29 WARN hdfs.DFSClient: DataStreamer Exception org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /alex/aa._COPYING_ could only be repl

how many job request can be in queue when the first MR JOB is blocked due to lack of resource?

2013-12-05 Thread ch huang
hi,maillist: are there any variables that can control it?

Re: Container [pid=22885,containerID=container_1386156666044_0001_01_000013] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 332.5 GB of 8 GB virtual memo

2013-12-05 Thread YouPeng Yang
Hi Have you spread your config over your cluster? And have you taken a look at whether the error containers are concentrated on particular nodes? regards 2013/12/5 panfei > Hi YouPeng, thanks for your advice. I have read the docs and configure the > parameters as follows: > > Physical Server: 8 core

RE: Hadoop-common-2.2.0 cannot find PlatformName

2013-12-05 Thread Su, Xiandong
I am running it remotely from the same PC where the HDP sandbox is installed. This is NOT a hadoop job yet; it is a simple HBase client. The exception is thrown when creating HBaseAdmin based on the configuration. The reason I am asking the question on the Hadoop user list is because the NoClass

Re: Hadoop-common-2.2.0 cannot find PlatformName

2013-12-05 Thread Ted Yu
How did you launch your java client ? Take a look at the sample command in this section: http://hbase.apache.org/book.html#trouble.mapreduce.local Cheers On Fri, Dec 6, 2013 at 7:11 AM, Su, Xiandong wrote: > I am trying to have a simple java client to connect to HBase in HDP 2.0. > I am usin

For the Newbies

2013-12-05 Thread Ravishankar Nair
Hi all, A very clear presentation given by my colleague to some audience on Hadoop last week. It has helped many of the people who attended to quickly pick up Hadoop. Hopefully it will help you as well; enjoy. Even the executives who attended have started appreciating Hadoop!! Thanks Golu, my friend

Hadoop-common-2.2.0 cannot find PlatformName

2013-12-05 Thread Su, Xiandong
I am trying to have a simple java client to connect to HBase in HDP 2.0. I am using maven to manage my dependencies and using the same version for the jars as HDP 2.0 specified: org.apache.hadoop hadoop-client 2.2.0 org.apache.hbase hbase-client 0.96.0-hadoop2 I am getting the NoClassDefFoundE

Re: Debugging/Modifying HDFS from Eclipse

2013-12-05 Thread Jing Zhao
MiniDFSCluster is used everywhere in HDFS's unit tests. You can easily find examples in the source code of HDFS (e.g., org.apache.hadoop.hdfs.TestDFSMkdirs). You can also test simple HA setup using MiniDFSCluster (e.g., org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode). On Thu, Dec 5, 2013

Re: Check compression codec of an HDFS file

2013-12-05 Thread alex bohr
The SequenceFile.Reader will work perfectly! (I should have seen that.) As always - thanks Harsh On Thu, Dec 5, 2013 at 2:22 AM, Harsh J wrote: > If you're looking for file header/contents based inspection, you could > download the file and run the Linux utility 'file' on the file, and it > sho

Re: Debugging/Modifying HDFS from Eclipse

2013-12-05 Thread Adam Kawa
One blog post is here: http://grepalex.com/2012/10/20/hadoop-unit-testing-with-minimrcluster/ When I was playing with miniDFSCluster, and miniMRCluster, I was using them via HBaseTestingUtility (it can take a configuration object in a constructor http://people.apache.org/~psmith/hbase/sandbox/hbas

Re: issue about capacity scheduler

2013-12-05 Thread Adam Kawa
The heap of application master is controlled via yarn.app.mapreduce.am.command-opts and its default value is -Xmx1024m ( http://hadoop.apache.org/docs/stable/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml ). yarn.scheduler.minimum-allocation-mb is completely different prop
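The relationship Adam describes can be sketched as a mapred-site.xml fragment: the AM's JVM heap (command-opts) must fit inside the AM's container allocation (resource.mb). The values below are the Hadoop 2 defaults, shown as examples:

```xml
<!-- mapred-site.xml: MR ApplicationMaster container size vs. its JVM heap.
     The -Xmx in command-opts must fit inside resource.mb
     (values shown are the Hadoop 2 defaults) -->
<property>
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>1536</value>
</property>
<property>
  <name>yarn.app.mapreduce.am.command-opts</name>
  <value>-Xmx1024m</value>
</property>
```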

Re: mapreduce.jobtracker.expire.trackers.interval no effect

2013-12-05 Thread Adam Kawa
> So i tried the deprecated parameter mapred.tasktracker.expiry.interval in > my configuration and voila it works! > Hansi, this is exactly the one parameter that I told you about in a previous post ;)

using yarn.nodemanager.container-monitor.resource-calculator.class option in Yarn

2013-12-05 Thread ricky l
Hi all, In the hadoop-3.0.0-SNAPSHOT I set the below option, hoping that it will throttle a container that over-utilizes its resources. yarn.nodemanager.container-monitor.resource-calculator.class org.apache.hadoop.yarn.util.LinuxResourceCalculatorPlugin If I start a nodemanager with the a
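One point worth noting about the property in question: the resource-calculator plugin only measures container usage (it is how the NodeManager computes the pmem/vmem numbers); enforcement is done by the separate pmem/vmem checks, so setting it alone will not throttle anything. The flattened setting above, as an XML fragment:

```xml
<!-- yarn-site.xml: this plugin *measures* per-container resource usage;
     it does not throttle. Enforcement comes from the pmem/vmem checks. -->
<property>
  <name>yarn.nodemanager.container-monitor.resource-calculator.class</name>
  <value>org.apache.hadoop.yarn.util.LinuxResourceCalculatorPlugin</value>
</property>
```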

Re: Apache Ambari

2013-12-05 Thread Jilal Oussama
Ok, thank you.

Re: Apache Ambari

2013-12-05 Thread Chris Embree
Unless something has recently changed, Ambari cannot work on an existing cluster. One of the several reasons we chose to eschew it. On 12/5/13, Jilal Oussama wrote: > Hello all, > > Pardon me to ask this question here instead of the Ambari mailing list (I > am not subscribed to it). > > I would

Apache Ambari

2013-12-05 Thread Jilal Oussama
Hello all, Pardon me for asking this question here instead of the Ambari mailing list (I am not subscribed to it). I would like to know whether you can install Ambari on a running cluster or whether it has to be a "fresh" one. Thank you.

RE: Ant BuildException error building Hadoop 2.2.0

2013-12-05 Thread java8964
It looks like failing in the cmake to build native code of hadoop-common. You need to find out the cmake output to identify the root cause. Yong Date: Thu, 5 Dec 2013 08:52:52 +0100 Subject: Re: Ant BuildException error building Hadoop 2.2.0 From: silvi.ca...@gmail.com To: user@hadoop.apache.org

Re: Implementing and running an applicationmaster

2013-12-05 Thread Rob Blah
Hi There is a way but it's not an easy one. You should overwrite the container request code in MR_AM. As each container in MapReduce gets the same amount of memory, OOM shouldn't be a problem, as inner task "buffers" can be spilled to disk. I am no MapReduce (code) specialist, but I would start by

Re: Debugging/Modifying HDFS from Eclipse

2013-12-05 Thread Karim Awara
Hi, Is there any source on how to use MiniDFSCluster (e.g. providing a configuration, etc.)? -- Best Regards, Karim Ahmed Awara On Tue, Dec 3, 2013 at 6:14 PM, Gaurav Sharma wrote: > You can use the minidfscluster for local testing and write more tests > around whatever functionality it is

Re: Implementing and running an applicationmaster

2013-12-05 Thread Yue Wang
Hi, Thank you for your answer. Now I understand the connection between the two ways. I asked this question because I want to take benefit from the YARN architecture. If I understood correctly, I can let my ApplicationMaster request containers more flexibly. For example, I can request two containe

Decision Tree implementation not working in cluster !

2013-12-05 Thread unmesha sreeveni
The decision tree is working perfectly in Eclipse Juno, but when I tried to run it in my cluster it is showing an error == *In main() * *In main() +++ run * *13/12/05 16:10:40 WARN mapred.JobClient: Use GenericOptionsParser for pars

Re: Check compression codec of an HDFS file

2013-12-05 Thread Harsh J
If you're looking for file header/contents based inspection, you could download the file and run the Linux utility 'file' on the file, and it should tell you the format. I don't know about Snappy (AFAIK, we don't have a snappy frame/container format support in Hadoop yet, although upstream Snappy
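Harsh's header-inspection suggestion can be done programmatically too: most container and compression formats announce themselves in their first few bytes. A minimal sketch of magic-byte sniffing (a tiny subset of what the Linux `file` utility does); the magic values below are the standard ones for each format:

```python
# Identify a file's format from its leading magic bytes. "SEQ" opens a
# Hadoop SequenceFile (followed by a version byte), 0x1f 0x8b is gzip,
# and "BZh" is bzip2.
MAGIC = {
    b"SEQ": "Hadoop SequenceFile",
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
}

def sniff(header):
    """Return a format name for the given leading bytes, or 'unknown'."""
    for magic, name in MAGIC.items():
        if header.startswith(magic):
            return name
    return "unknown"

print(sniff(b"SEQ\x06org.apache.hadoop.io.Text"))  # Hadoop SequenceFile
print(sniff(b"\x1f\x8b\x08\x00"))                  # gzip
```

As Harsh notes, raw Snappy output has no standard framing header, which is why this kind of sniffing cannot identify it.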

Found checksum error

2013-12-05 Thread unmesha sreeveni
I am trying to run the Decision Tree explained in http://btechfreakz.blogspot.in/2013/04/implementation-of-c45-algorithm-using.html but while running I am getting a checksum error: 13/12/05 15:23:12 INFO fs.FSInputChecker: Found checksum error: b[0, 512]=3320547275652059657320330a30204f766572636173742
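For background on the error message: HDFS-side checksumming verifies data in fixed-size chunks (io.bytes.per.checksum, 512 bytes by default) against CRC checksums kept in a hidden .crc sidecar file, and "b[0, 512]" indicates the very first chunk failed. A minimal Python sketch of per-chunk CRC verification (illustrative only, not HDFS's actual implementation):

```python
# Per-chunk CRC verification, in the spirit of HDFS's FSInputChecker:
# data is checked in fixed-size chunks against previously stored CRCs.
import zlib

CHUNK = 512  # HDFS default io.bytes.per.checksum

def chunk_checksums(data, chunk=CHUNK):
    """CRC32 per fixed-size chunk, like the .crc sidecar HDFS keeps."""
    return [zlib.crc32(data[i:i + chunk]) for i in range(0, len(data), chunk)]

def verify(data, checksums, chunk=CHUNK):
    """Return the index of the first corrupt chunk, or -1 if all match."""
    for idx, crc in enumerate(chunk_checksums(data, chunk)):
        if crc != checksums[idx]:
            return idx
    return -1

original = b"Outlook Temperature Play\n" * 40   # ~1000 bytes -> 2 chunks
crcs = chunk_checksums(original)
tampered = b"X" + original[1:]                  # edit inside chunk 0
print(verify(original, crcs))   # -1: clean
print(verify(tampered, crcs))   # 0: first chunk fails, like b[0, 512]
```

A common way to hit this in practice is editing a local copy of a file after its .crc was written; deleting the stale sidecar or re-copying the file regenerates the checksums.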

Hadoop Multi-tenant cluster setUp

2013-12-05 Thread Manisha Sethi
Hi, I would like to know 1. How Hadoop 2 provides multi-tenancy using schedulers, or in simple terms, "What are the steps to configure a multi-tenant hadoop cluster?" And here multi-tenancy means different users can run their applications (similar/different) in a way such that each us
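One common way to get this in Hadoop 2 is per-tenant queues in the Capacity Scheduler, each with a capacity share and a submit ACL. A hedged capacity-scheduler.xml sketch (queue names, percentages, and users are all assumptions for illustration):

```xml
<!-- capacity-scheduler.xml: two tenant queues with capacity shares and a
     submit ACL (all names and values here are illustrative assumptions) -->
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>teamA,teamB</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.teamA.capacity</name>
  <value>60</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.teamB.capacity</name>
  <value>40</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.teamA.acl_submit_applications</name>
  <!-- format: comma-separated users, a space, comma-separated groups -->
  <value>alice,bob teamA-group</value>
</property>
```

Jobs are then routed to a tenant's queue via mapreduce.job.queuename, the same mechanism discussed in the fair-scheduler thread above.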

Re: Implementing and running an applicationmaster

2013-12-05 Thread Rob Blah
Hi If I understood you correctly, you would like to run your AM with the YARN Client from the shell as opposed to running the Driver like in MRv1. But it's the same thing (more or less). In the example you provided (org.apache.hadoop.yarn.applications.DistributedShell) the Client.class is the "driver". However

Re: Container [pid=22885,containerID=container_1386156666044_0001_01_000013] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 332.5 GB of 8 GB virtual memo

2013-12-05 Thread panfei
Hi YouPeng, thanks for your advice. I have read the docs and configured the parameters as follows: Physical Server: 8-core CPU, 16GB memory. For YARN: yarn.nodemanager.resource.memory-mb set to 12GB, keeping 4GB for the OS. yarn.scheduler.minimum-allocation-mb set to 2048M as the minimum alloc