Re: hadoop configure issues
Hi Evan, I think this is why: 24 * 1.2g < 100g, so YARN happily schedules all 24 containers on the NodeManager even though the huge page pool is only 16G. I don't know the details of "huge pages" in the IBM JDK, but you could configure the NodeManager with 16g instead. Thanks. Drake 민영근 Ph.D kt NexR
On Thu, Jan 14, 2016 at 2:53 PM, yaoxiaohua wrote: > Hi guys, > We use huge pages for linux; the total huge page memory is 16G. > Our environment is 128G memory, 28 disks, 32 (logical) CPUs, IBM JDK 1.7, Cdh2.3, Linux: overcommit 0. > For one nodemanager, we give 100g total, and vcores: 24. So I find that one nodemanager can assign 24 containers at the same time. > And every container's java opts is: -server -Xms1200m -Xmx1200m -Xlp -Xnoclassgc -Xgcpolicy:gencon -Xjit:optLevel=hot > -Xlp in the IBM JDK means use huge pages. > My question is: when the cluster is busy, I find 24 containers launched at the same time, but we just have 16G huge pages total. Why does this happen? 24 * 1.2g > 16G > Thanks > Best Regards, > Evan
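For reference, a minimal sketch of that suggestion in yarn-site.xml (16384 is simply the 16G huge page pool expressed in MB; adjust it to your actual pool size):

  <property>
    <!-- total memory the NodeManager may hand out to containers -->
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>16384</value>
  </property>

With the same per-container size as before, the node then only runs as many containers concurrently as fit in 16G, so the -Xlp heaps cannot outgrow the huge page pool.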
Re: hadoop mapreduce job rest api
Maybe this?: http://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Applications_APISubmit_Application Drake 민영근 Ph.D kt NexR On Thu, Dec 24, 2015 at 3:04 PM, Artem Ervitswrote: > Take a look at webhcat api > On Dec 24, 2015 12:50 AM, "ram kumar" wrote: > >> Hi, >> >> I want to submit a mapreduce job using rest api, >> and get the status of the job every n interval. >> Is there a way to do it? >> >> Thanks >> >
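A rough sketch of the submit-and-poll flow against that REST API (rm-host, the application id, and the app-submit.json payload are placeholders; the JSON body format is described on the page above):

  # 1. ask the ResourceManager for a new application id
  curl -X POST http://rm-host:8088/ws/v1/cluster/apps/new-application

  # 2. submit the application, using the id from step 1 inside app-submit.json
  curl -X POST -H "Content-Type: application/json" -d @app-submit.json http://rm-host:8088/ws/v1/cluster/apps

  # 3. poll the state every n seconds
  while true; do
    curl -s http://rm-host:8088/ws/v1/cluster/apps/application_1450000000000_0001/state
    sleep 30
  done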
HDFS Datanode Blockreport includes invalid block.
Greetings, all. Recently I ran into the worst case for HDFS: a missing block. The namenode timeline (from its log) is below:

initial state: Datanode A (or B or C, not confirmed yet) and 192.168.100.90 contain block_1.
(Datanodes A, B, C crash with a hardware fault)
07:33:03: ask replication of block_1 from 192.168.100.90 to 192.168.100.210
07:34:14: updatedBlockmap addStoredBlock 192.168.100.210 with block_1
...
(Datanodes A, B, C recover)
18:11:36: block_1 of 192.168.100.210 is invalidated
18:11:45: block_1 of 192.168.100.210 is deleted
...
19:43:03: updatedBlockmap addStoredBlock 192.168.100.210 with block_1
19:43:03: block_1 of 192.168.100.90 is invalidated
19:43:04: block_1 of 192.168.100.90 is deleted
...
(Datanodes A, B, C crash with a hardware fault AGAIN)
00:27:21: ask replication of block_1 from 192.168.100.210 to 192.168.100.82
07:34:14: error because block_1 of 192.168.100.210 is invalid

At 19:43:03, datanode 192.168.100.210 sent its block report to the namenode. I guess 192.168.100.210's block report contained a wrong — in this case invalid — block. Has anyone seen this problem? Sorry for the log format; I cannot get the full logs. Thanks. Drake 민영근 Ph.D kt NexR
Re: YARN container killed as running beyond memory limits
Hi, You should disable vmem check. See this: http://blog.cloudera.com/blog/2014/04/apache-hadoop-yarn-avoiding-6-time-consuming-gotchas/ Thanks. 2015년 6월 17일 수요일, Naganarasimha G R (Naga)garlanaganarasi...@huawei.com님이 작성한 메시지: Hi, From the logs its pretty clear its due to *Current usage: 576.2 MB of 2 GB physical memory used; 4.2 GB of 4.2 GB virtual memory used. Killing container.* Please increase the value yarn.nodemanager.vmem-pmem-ratio from the default value 2 to something like 4 or 8 based on ur app and system. + Naga -- *From:* Arbi Akhina [arbi.akh...@gmail.com javascript:_e(%7B%7D,'cvml','arbi.akh...@gmail.com');] *Sent:* Wednesday, June 17, 2015 17:19 *To:* user@hadoop.apache.org javascript:_e(%7B%7D,'cvml','user@hadoop.apache.org'); *Subject:* YARN container killed as running beyond memory limits Hi, I've a YARN application that submits containers. In the AplicationMaster logs I see that the container is killed. Here is the logs: Jun 17, 2015 1:31:27 PM com.heavenize.modules.RMCallbackHandler onContainersCompleted INFO: container 'container_1434471275225_0007_01_02' status is ContainerStatus: [ContainerId: container_1434471275225_0007_01_02, State: COMPLETE, Diagnostics: Container [pid=4069,containerID=container_1434471275225_0007_01_02] is running beyond virtual memory limits. Current usage: 576.2 MB of 2 GB physical memory used; 4.2 GB of 4.2 GB virtual memory used. Killing container. Dump of the process-tree for container_1434471275225_0007_01_02 : |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE |- 4094 4093 4069 4069 (java) 2932 94 2916065280 122804 /usr/lib/jvm/java-7-openjdk-amd64/bin/java -Xms512m -Xmx2048m -XX:MaxPermSize=250m -XX:+UseConcMarkSweepGC -Dosmoze.path=/tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/container_1434471275225_0007_01_02/Osmoze -Dspring.profiles.active=webServer -jar /tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/container_1434471275225_0007_01_02/heavenize-modules.jar |- 4093 4073 4069 4069 (sh) 0 0 4550656 164 /bin/sh /tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/container_1434471275225_0007_01_02/startup.sh |- 4073 4069 4069 4069 (java) 249 34 1577267200 24239 /usr/lib/jvm/java-7-openjdk-amd64/bin/java com.heavenize.yarn.task.ModulesManager -containerId container_1434471275225_0007_01_02 -port 5369 -exe hdfs://hadoop-server/user/hadoop/heavenize/heavenize-modules.jar -conf hdfs://hadoop-server/user/hadoop/heavenize/config.zip |- 4069 1884 4069 4069 (bash) 0 0 12730368 304 /bin/bash -c /usr/lib/jvm/java-7-openjdk-amd64/bin/java com.heavenize.yarn.task.ModulesManager -containerId container_1434471275225_0007_01_02 -port 5369 -exe hdfs://hadoop-server/user/hadoop/heavenize/heavenize-modules.jar -conf hdfs://hadoop-server/user/hadoop/heavenize/config.zip 1 /usr/local/hadoop/logs/userlogs/application_1434471275225_0007/container_1434471275225_0007_01_02/stdout 2 /usr/local/hadoop/logs/userlogs/application_1434471275225_0007/container_1434471275225_0007_01_02/stderr I don't see any memory excess, any idea where this error comes from? There is no errors in the container, it just stop logging as a result of being killed. -- Drake 민영근 Ph.D kt NexR
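For reference, the two fixes suggested in this thread correspond to these yarn-site.xml properties (use one or the other; the ratio 4 is just an example):

  <property>
    <!-- option 1: disable the virtual memory check entirely -->
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <!-- option 2: keep the check but allow more vmem per MB of physical memory -->
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>4</value>
  </property>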
Re: How to test DFS?
Hi, You can use the 'hdfs fsck' command to determine block locations. A sample run is shown below:

  [root@qa-b1 ~]# hdfs fsck /tmp/jack -files -blocks -locations
  Connecting to namenode via http://192.168.50.171:50070
  FSCK started by root (auth:SIMPLE) from /192.168.50.170 for path /tmp/jack at Wed May 27 14:51:56 KST 2015
  /tmp/jack 517472256 bytes, 4 block(s): OK
  0. BP-1171919055-192.168.50.171-1431320286009:blk_1073742878_2054 len=134217728 repl=3 [192.168.50.174:50010, 192.168.50.172:50010, 192.168.50.173:50010]
  1. BP-1171919055-192.168.50.171-1431320286009:blk_1073742879_2055 len=134217728 repl=3 [192.168.50.174:50010, 192.168.50.172:50010, 192.168.50.173:50010]
  2. BP-1171919055-192.168.50.171-1431320286009:blk_1073742880_2056 len=134217728 repl=3 [192.168.50.174:50010, 192.168.50.172:50010, 192.168.50.173:50010]
  3. BP-1171919055-192.168.50.171-1431320286009:blk_1073742881_2057 len=114819072 repl=3 [192.168.50.174:50010, 192.168.50.172:50010, 192.168.50.173:50010]

The file /tmp/jack is split into four blocks. Block 0 is replicated on 3 nodes: 192.168.50.174, 192.168.50.172, 192.168.50.173. Thanks. Drake 민영근 Ph.D kt NexR
On Wed, May 27, 2015 at 8:58 AM, jay vyas jayunit100.apa...@gmail.com wrote: you could just list the file contents in your hadoop data/ directories of the individual nodes ... somewhere in there the file blocks will be floating around.
On Tue, May 26, 2015 at 4:59 PM, Caesar Samsi caesarsa...@mac.com wrote: Hello, How would I go about confirming that a file has been distributed successfully to all datanodes? I would like to demonstrate this capability in a short briefing for my colleagues. Can I access the file from the datanode itself (to date I can only access the files from the master node, not the slaves)? Thank you, Caesar. -- jay vyas
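To follow up on jay's suggestion of looking at the datanodes directly, a rough sketch (the /data/dfs/dn path is an assumption — use whatever dfs.datanode.data.dir points to on your nodes, together with one of the block ids printed by fsck above):

  # run on one of the datanodes reported by fsck, e.g. 192.168.50.174
  find /data/dfs/dn -name 'blk_1073742878*'
  # expect the block file itself plus a blk_1073742878_2054.meta checksum file
  # somewhere under current/BP-1171919055-192.168.50.171-1431320286009/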
Re: Hive startup error
I think some conflict in jars. add below in hive-env.sh export HADOOP_USER_CLASSPATH_FIRST=true Thanks. Drake 민영근 Ph.D kt NexR On Fri, May 15, 2015 at 7:01 AM, Ted Yu yuzhih...@gmail.com wrote: bq. java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but interface was expected Looks like the jline jar on classpath is incompatible with the one Hive was built with. BTW Hive user mailing list is better place to ask this question. Cheers On Thu, May 14, 2015 at 12:02 AM, Anand Murali anand_vi...@yahoo.com wrote: Dear All: I have installed Hive 1.1.0 and try to run it and get the following error. Can somebody advise please anand_vihar@Latitude-E5540:~$ hive Logging initialized using configuration in jar:file:/home/anand_vihar/hive-1.1.0/lib/hive-common-1.1.0.jar!/hive-log4j.properties SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/home/anand_vihar/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/home/anand_vihar/hive-1.1.0/lib/hive-jdbc-1.1.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] [ERROR] Terminal initialization failed; falling back to unsupported java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but interface was expected at jline.TerminalFactory.create(TerminalFactory.java:101) at jline.TerminalFactory.get(TerminalFactory.java:158) at jline.console.ConsoleReader.init(ConsoleReader.java:229) at jline.console.ConsoleReader.init(ConsoleReader.java:221) at jline.console.ConsoleReader.init(ConsoleReader.java:209) at org.apache.hadoop.hive.cli.CliDriver.getConsoleReader(CliDriver.java:773) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:715) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) Exception in thread main java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but interface was expected at jline.console.ConsoleReader.init(ConsoleReader.java:230) at jline.console.ConsoleReader.init(ConsoleReader.java:221) at jline.console.ConsoleReader.init(ConsoleReader.java:209) at org.apache.hadoop.hive.cli.CliDriver.getConsoleReader(CliDriver.java:773) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:715) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) Thanks Anand Murali
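The usual way to confirm the conflict is to compare the jline jars the two projects ship (paths assume a plain tarball install):

  ls $HIVE_HOME/lib/jline-*.jar
  ls $HADOOP_HOME/share/hadoop/yarn/lib/jline-*.jar
  # Hive 1.1 expects jline 2.x, where jline.Terminal is an interface; an older
  # jline 0.9.x on the Hadoop side wins on the classpath unless
  # HADOOP_USER_CLASSPATH_FIRST=true is exported in hive-env.sh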
Re: Re: Filtering by value in Reducer
Hi, Peter. The missing records — are they just gone, with no logs at all? How about your reduce task logs? Thanks Drake 민영근 Ph.D kt NexR On Tue, May 12, 2015 at 5:18 AM, Peter Ruch rutschifen...@gmail.com wrote: Hello, sum and threshold are both Integers. For the threshold variable I first add a new resource to the configuration - conf.addResource( ... ); later I get the threshold value from the configuration. Code # private int threshold; public void setup( Context context ) { Configuration conf = context.getConfiguration(); threshold = conf.getInt( "threshold", -1 ); } # Best, Peter On 11.05.2015 19:26, Shahab Yunus wrote: What is the type of the threshold variable? sum, I believe, is a Java int. Regards, Shahab On Mon, May 11, 2015 at 1:08 PM, Peter Ruch rutschifen...@gmail.com wrote: Hi, I am currently playing around with Hadoop and have some problems when trying to filter in the Reducer. I extended the WordCount v1.0 example from the 2.7 MapReduce Tutorial with some additional functionality and added the possibility to filter by the specific value of each key - e.g. only output the key-value pairs where [[ value > threshold ]]. Filtering Code in Reducer # for (IntWritable val : values) { sum += val.get(); } if ( sum > threshold ) { result.set(sum); context.write(key, result); } # For a threshold smaller than any value the above code works as expected and the output contains all key-value pairs. If I increase the threshold to 1 some pairs are missing in the output although the respective value would be larger than the threshold. I tried to work out the error myself, but I could not get it to work as intended. I use the exact Tutorial setup with Oracle JDK 8 on a CentOS 7 machine. As far as I understand, the respective Iterable... in the Reducer already contains all the observed values for a specific key. Why is it possible that I am missing some of these key-value pairs then? It only fails in very few cases. The input file is pretty large - 250 MB - so I also tried to increase the memory for the mapping and reduction steps but it did not help ( tried a lot of different stuff without success ) Maybe someone already experienced similar problems / is more experienced than I am. Thank you, Peter
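For what it's worth, here is the complete shape of the reducer being discussed as a self-contained class (the class name and the "threshold" property name are illustrative):

  import java.io.IOException;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Reducer;

  public class FilteringReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    private int threshold;
    private final IntWritable result = new IntWritable();

    @Override
    protected void setup(Context context) {
      Configuration conf = context.getConfiguration();
      // -1 effectively disables filtering if the property is missing
      threshold = conf.getInt("threshold", -1);
    }

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      // only emit keys whose total exceeds the configured threshold
      if (sum > threshold) {
        result.set(sum);
        context.write(key, result);
      }
    }
  }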
Re: Question about Block size configuration
Hi, I think the metadata size per block is not greatly different. The problem is the number of blocks: if the block size is less than 64MB, more blocks are generated for the same amount of data (with 32MB blocks, 2x more). And yes, all metadata is kept in the namenode's heap memory. Thanks. Drake 민영근 Ph.D kt NexR On Tue, May 12, 2015 at 3:31 PM, Himawan Mahardianto mahardia...@ugm.ac.id wrote: thank you for the explanation, and how many bytes will each piece of metadata consume in RAM if the block size is 64MB or smaller than that? I heard every piece of metadata is stored in RAM, right?
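A rough worked example of the difference (the ~150 bytes per block object of namenode heap is only a commonly quoted rule of thumb, not an exact figure):

  1 TB of data with 64 MB blocks  ->  ~16,384 blocks
  1 TB of data with 32 MB blocks  ->  ~32,768 blocks
  at ~150 bytes of namenode heap per block object, that is roughly
  2.4 MB vs 4.9 MB of heap for the same 1 TB of data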
Re: Map Reduce Help
Hi. The mapreduce example is the case. See this: https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/ExampleDriver.java Drake 민영근 Ph.D kt NexR On Wed, May 6, 2015 at 2:00 AM, Chandrashekhar Kotekar shekhar.kote...@gmail.com wrote: Technically yes, you can keep all map reduce jobs in single jar file because all map reduce jobs are nothing but java classes but I think its better to keep all map-reduce job isolated so that you will be able to modify them easily in future. Regards, Chandrash3khar Kotekar Mobile - +91 8600011455 On Tue, May 5, 2015 at 9:18 PM, Nishanth S chinchu2...@gmail.com wrote: Hello, I am very new to map reduce.We need to wirte few map reduce jobs to process different binary files.Can all the different map reduce programs be packaged into a single jar file?. Thanks, Chinchu
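The pattern in that ExampleDriver is just org.apache.hadoop.util.ProgramDriver, so one jar can carry several jobs; a minimal sketch (the two job classes are made-up names, each an ordinary driver with its own main method):

  import org.apache.hadoop.util.ProgramDriver;

  public class MyJobsDriver {
    public static void main(String[] args) {
      int exitCode = -1;
      ProgramDriver pgd = new ProgramDriver();
      try {
        // register each job class under a command-line name
        pgd.addClass("parsebinary", ParseBinaryJob.class, "Parse the binary input files");
        pgd.addClass("aggregate", AggregateJob.class, "Aggregate the parsed records");
        pgd.driver(args); // dispatches to the class named in args[0]
        exitCode = 0;
      } catch (Throwable e) {
        e.printStackTrace();
      }
      System.exit(exitCode);
    }
  }

With MyJobsDriver as the jar's main class you would then run: hadoop jar myjobs.jar parsebinary <in> <out>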
Re: how to load data
Hi, Jay It seems there is no jar for openCSV. Check your hive/lib/opencsv-x.y.jar. Thanks. Drake 민영근 Ph.D kt NexR On Mon, May 4, 2015 at 11:03 AM, Kumar Jayapal kjayapa...@gmail.com wrote: Hi I have created a table as you said, CREATE TABLE Seq1 ( d5whse int COMMENT 'DECIMAL(5,0) Whse', d5sdat string COMMENT 'DATE Sales Date', d5reg_num smallint COMMENT 'DECIMAL(3,0) Reg#', d5trn_num int COMMENT 'DECIMAL(5,0) Trn#', d5scnr string COMMENT 'CHAR(1) Scenario', d5areq string COMMENT 'CHAR(1) Act Requested', d5atak string COMMENT 'CHAR(1) Act Taken', d5msgc string COMMENT 'CHAR(3) Msg Code') PARTITIONED BY (FISCAL_YEAR smallint, FISCAL_PERIOD smallint) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' WITH SERDEPROPERTIES (separatorChar = ,, quoteChar = \) STORED AS TEXTFILE and it got successfully and I was able to insert the values into it with our , and now I have another issue I am not able to insert the values from this table to parque Seq2 INSERT INTO TABLE seq2 PARTITION (FISCAL_YEAR = 2003, FISCAL_PERIOD = 06) SELECT* FROM SEQ I get this error 2015-05-04 01:55:42,000 INFO [IPC Server handler 2 on 57009] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report from attempt_1430691855979_0477_m_00_1: Error: java.lang.RuntimeException: java.lang.NoClassDefFoundError: au/com/bytecode/opencsv/CSVReader at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:198) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: java.lang.NoClassDefFoundError: au/com/bytecode/opencsv/CSVReader at org.apache.hadoop.hive.serde2.OpenCSVSerde.newReader(OpenCSVSerde.java:177) at org.apache.hadoop.hive.serde2.OpenCSVSerde.deserialize(OpenCSVSerde.java:147) at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:154) at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:127) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:180) ... 
8 more Caused by: java.lang Thanks Jay On Sun, May 3, 2015 at 6:57 PM, Kumar Jayapal kjayapa...@gmail.com wrote: Hi, I have created the table as you said 2015-05-04 01:55:42,000 INFO [IPC Server handler 2 on 57009] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report from attempt_1430691855979_0477_m_00_1: Error: java.lang.RuntimeException: java.lang.NoClassDefFoundError: au/com/bytecode/opencsv/CSVReader at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:198) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: java.lang.NoClassDefFoundError: au/com/bytecode/opencsv/CSVReader at org.apache.hadoop.hive.serde2.OpenCSVSerde.newReader(OpenCSVSerde.java:177) at org.apache.hadoop.hive.serde2.OpenCSVSerde.deserialize(OpenCSVSerde.java:147) at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:154) at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:127) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:180) ... 8 more Caused by: java.lang.ClassNotFoundException: au.com.bytecode.opencsv.CSVReader at java.net.URLClassLoader$1.run(URLClassLoader.java:366) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:425) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at java.lang.ClassLoader.loadClass(ClassLoader.java:358) ... 14 more
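If the SerDe's jar really is missing from the job's classpath, one quick way to unblock the INSERT ... SELECT is to ship it with the query yourself (the path and version below are assumptions — point it at whatever opencsv jar your distribution provides, or set hive.aux.jars.path instead):

  -- in the Hive session, before running the INSERT ... SELECT
  ADD JAR /path/to/opencsv-2.3.jar;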
Re: rolling upgrade(2.4.1 to 2.6.0) problem
Hi, IMHO, Upgrade *with downtime* after 2.7.1 is the best option left. Thanks. Drake 민영근 Ph.D kt NexR On Mon, Apr 27, 2015 at 5:46 PM, Nitin Pawar nitinpawar...@gmail.com wrote: I had read somewhere 2.7 has lots of issues so you should wait for 2.7.1 where most of them are getting addressed On Mon, Apr 27, 2015 at 2:14 PM, 조주일 tjst...@kgrid.co.kr wrote: I think heartbeat failure cause is hang of nodes. I found a bug report associated with this problem. https://issues.apache.org/jira/browse/HDFS-7489 https://issues.apache.org/jira/browse/HDFS-7496 https://issues.apache.org/jira/browse/HDFS-7531 https://issues.apache.org/jira/browse/HDFS-8051 It has been fixed in 2.7. I do not have experience patch. And Because of this stability has not been confirmed, I can not upgrade to 2.7. What do you recommend for that? How can I do the patch, if I will do patch? Can I patch without service dowtime. -Original Message- *From:* Drake민영근drake@nexr.com *To:* useruser@hadoop.apache.org; 조주일tjst...@kgrid.co.kr; *Cc:* *Sent:* 2015-04-24 (금) 17:41:59 *Subject:* Re: rolling upgrade(2.4.1 to 2.6.0) problem Hi, I think limited by max user processes. see this: https://plumbr.eu/outofmemoryerror/unable-to-create-new-native-thread In your case, user cannot create more than 10240 processes. In our env, the limit is more like 65000. I think it's worth a try. And, if hdfs datanode daemon's user is not root, set the limit file into /etc/security/limits.d Thanks. Drake 민영근 Ph.D kt NexR On Fri, Apr 24, 2015 at 5:15 PM, 조주일 tjst...@kgrid.co.kr wrote: ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 62580 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 102400 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 10240 cpu time (seconds, -t) unlimited max user processes (-u) 10240 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited -- Hadoop cluster was operating normally in the 2.4.1 version. Hadoop cluster is a problem in version 2.6. E.g Slow BlockReceiver logs are often seen org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write data to disk cost If the data node failure and under-block occurs, another many nodes heartbeat check is fails. So, I stop all nodes and I start all nodes. The cluster is then normalized. In this regard, Hadoop Is there a difference between version 2.4 and 2.6? ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 62580 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 102400 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 10240 cpu time (seconds, -t) unlimited max user processes (-u) 10240 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited -Original Message- *From:* Drake민영근drake@nexr.com *To:* useruser@hadoop.apache.org; 조주일tjst...@kgrid.co.kr; *Cc:* *Sent:* 2015-04-24 (금) 16:58:46 *Subject:* Re: rolling upgrade(2.4.1 to 2.6.0) problem HI, How about the ulimit setting of the user for hdfs datanode ? Drake 민영근 Ph.D kt NexR On Wed, Apr 22, 2015 at 6:25 PM, 조주일 tjst...@kgrid.co.kr wrote: I allocated 5G. 
I think OOM is not the cause of essentially -Original Message- *From:* Han-Cheol Chohancheol@nhn-playart.com *To:* user@hadoop.apache.org; *Cc:* *Sent:* 2015-04-22 (수) 15:32:35 *Subject:* RE: rolling upgrade(2.4.1 to 2.6.0) problem Hi, The first warning shows out-of-memory error of JVM. Did you give enough max heap memory for DataNode daemons? DN daemons, by default, uses max heap size 1GB. So if your DN requires more than that, it will be in a trouble. You can check the memory consumption of you DN dameons (e.g., top command) and the memory allocated to them by -Xmx option (e.g., jps -lmv). If the max heap size is too small, you can use HADOOP_DATANODE_OPTS variable (e.g., HADOOP_DATANODE_OPTS=-Xmx4g) to override it. Best wishes, Han-Cheol -Original Message- *From:* 조주일tjst...@kgrid.co.kr *To:* user@hadoop.apache.org;
Re: rolling upgrade(2.4.1 to 2.6.0) problem
HI, How about the ulimit setting of the user for hdfs datanode ? Drake 민영근 Ph.D kt NexR On Wed, Apr 22, 2015 at 6:25 PM, 조주일 tjst...@kgrid.co.kr wrote: I allocated 5G. I think OOM is not the cause of essentially -Original Message- *From:* Han-Cheol Chohancheol@nhn-playart.com *To:* user@hadoop.apache.org; *Cc:* *Sent:* 2015-04-22 (수) 15:32:35 *Subject:* RE: rolling upgrade(2.4.1 to 2.6.0) problem Hi, The first warning shows out-of-memory error of JVM. Did you give enough max heap memory for DataNode daemons? DN daemons, by default, uses max heap size 1GB. So if your DN requires more than that, it will be in a trouble. You can check the memory consumption of you DN dameons (e.g., top command) and the memory allocated to them by -Xmx option (e.g., jps -lmv). If the max heap size is too small, you can use HADOOP_DATANODE_OPTS variable (e.g., HADOOP_DATANODE_OPTS=-Xmx4g) to override it. Best wishes, Han-Cheol -Original Message- *From:* 조주일tjst...@kgrid.co.kr *To:* user@hadoop.apache.org; *Cc:* *Sent:* 2015-04-22 (수) 14:54:16 *Subject:* rolling upgrade(2.4.1 to 2.6.0) problem My Cluster is.. hadoop 2.4.1 Capacity : 1.24PB Used 1.1PB 16 Datanodes Each node is a capacity of 65TB, 96TB, 80TB, Etc.. I had to proceed with the rolling upgrade 2.4.1 to 2.6.0. A data node upgraded takes about 40 minutes. Occurs during the upgrade is in progress under-block. 10 nodes completed upgrade 2.6.0. Had a problem at some point during a rolling upgrade of the remaining nodes. Heartbeat of the many nodes(2.6.0 only) has failed. I did changes the following attributes but I did not fix the problem, dfs.datanode.handler.count = 100 --- 300, 400, 500 dfs.datanode.max.transfer.threads = 4096 --- 8000, 1 I think, 1. Something that causes a delay in processing threads. I think it may be because the block replication between different versions. 2. Whereby the many handlers and xceiver became necessary. 3. Whereby the out of memory, an error occurs. Or the problem arises on a datanode. 4. Heartbeat fails, and datanode dies. I found a datanode error log for the following: However, it is impossible to determine the cause. I think, therefore I am. Called because it blocks the replication between different versions Give me someone help me !! DATANODE LOG -- ### I had to check a few thousand close_wait connection from the datanode. org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write packet to mirror took 1207ms (threshold=300ms) 2015-04-21 22:46:01,772 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DataNode is out of memory. Will retry in 30 seconds. 
java.lang.OutOfMemoryError: unable to create new native thread at java.lang.Thread.start0(Native Method) at java.lang.Thread.start(Thread.java:640) at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:145) at java.lang.Thread.run(Thread.java:662) 2015-04-21 22:49:45,378 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: datanode-192.168.1.207:40010:DataXceiverServer:java.io.IOException: Xceiver count 8193 exceeds the limit of concurrent xcievers: 8192 at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:140) at java.lang.Thread.run(Thread.java:662) 2015-04-22 01:01:25,632 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: datanode-192.168.1.207:40010:DataXceiverServer:java.io.IOException: Xceiver count 8193 exceeds the limit of concurrent xcievers: 8192 at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:140) at java.lang.Thread.run(Thread.java:662) 2015-04-22 03:49:44,125 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: datanode-192.168.1.204:40010:DataXceiver error processing READ_BLOCK operation src: /192.168.2.174:45606 dst: /192.168.1.204:40010 java.io.IOException: cannot find BPOfferService for bpid=BP-1770955034-0.0.0.0-1401163460236 at org.apache.hadoop.hdfs.server.datanode.DataNode.getDNRegistrationForBP(DataNode.java:1387) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:470) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235) at java.lang.Thread.run(Thread.java:662) 2015-04-22 05:30:28,947 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.1.203, datanodeUuid=654f22ef-84b3-4ecb-a959-2ea46d817c19, infoPort=40075, ipcPort=40020,
Re: rolling upgrade(2.4.1 to 2.6.0) problem
Hi, I think limited by max user processes. see this: https://plumbr.eu/outofmemoryerror/unable-to-create-new-native-thread In your case, user cannot create more than 10240 processes. In our env, the limit is more like 65000. I think it's worth a try. And, if hdfs datanode daemon's user is not root, set the limit file into /etc/security/limits.d Thanks. Drake 민영근 Ph.D kt NexR On Fri, Apr 24, 2015 at 5:15 PM, 조주일 tjst...@kgrid.co.kr wrote: ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 62580 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 102400 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 10240 cpu time (seconds, -t) unlimited max user processes (-u) 10240 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited -- Hadoop cluster was operating normally in the 2.4.1 version. Hadoop cluster is a problem in version 2.6. E.g Slow BlockReceiver logs are often seen org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write data to disk cost If the data node failure and under-block occurs, another many nodes heartbeat check is fails. So, I stop all nodes and I start all nodes. The cluster is then normalized. In this regard, Hadoop Is there a difference between version 2.4 and 2.6? ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 62580 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 102400 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 10240 cpu time (seconds, -t) unlimited max user processes (-u) 10240 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited -Original Message- *From:* Drake민영근drake@nexr.com *To:* useruser@hadoop.apache.org; 조주일tjst...@kgrid.co.kr; *Cc:* *Sent:* 2015-04-24 (금) 16:58:46 *Subject:* Re: rolling upgrade(2.4.1 to 2.6.0) problem HI, How about the ulimit setting of the user for hdfs datanode ? Drake 민영근 Ph.D kt NexR On Wed, Apr 22, 2015 at 6:25 PM, 조주일 tjst...@kgrid.co.kr wrote: I allocated 5G. I think OOM is not the cause of essentially -Original Message- *From:* Han-Cheol Chohancheol@nhn-playart.com *To:* user@hadoop.apache.org; *Cc:* *Sent:* 2015-04-22 (수) 15:32:35 *Subject:* RE: rolling upgrade(2.4.1 to 2.6.0) problem Hi, The first warning shows out-of-memory error of JVM. Did you give enough max heap memory for DataNode daemons? DN daemons, by default, uses max heap size 1GB. So if your DN requires more than that, it will be in a trouble. You can check the memory consumption of you DN dameons (e.g., top command) and the memory allocated to them by -Xmx option (e.g., jps -lmv). If the max heap size is too small, you can use HADOOP_DATANODE_OPTS variable (e.g., HADOOP_DATANODE_OPTS=-Xmx4g) to override it. Best wishes, Han-Cheol -Original Message- *From:* 조주일tjst...@kgrid.co.kr *To:* user@hadoop.apache.org; *Cc:* *Sent:* 2015-04-22 (수) 14:54:16 *Subject:* rolling upgrade(2.4.1 to 2.6.0) problem My Cluster is.. hadoop 2.4.1 Capacity : 1.24PB Used 1.1PB 16 Datanodes Each node is a capacity of 65TB, 96TB, 80TB, Etc.. I had to proceed with the rolling upgrade 2.4.1 to 2.6.0. A data node upgraded takes about 40 minutes. Occurs during the upgrade is in progress under-block. 10 nodes completed upgrade 2.6.0. 
Had a problem at some point during a rolling upgrade of the remaining nodes. Heartbeat of the many nodes(2.6.0 only) has failed. I did changes the following attributes but I did not fix the problem, dfs.datanode.handler.count = 100 --- 300, 400, 500 dfs.datanode.max.transfer.threads = 4096 --- 8000, 1 I think, 1. Something that causes a delay in processing threads. I think it may be because the block replication between different versions. 2. Whereby the many handlers and xceiver became necessary. 3. Whereby the out of memory, an error occurs. Or the problem arises on a datanode. 4. Heartbeat fails, and datanode dies. I found a datanode error log for the following: However, it is impossible to determine the
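A sketch of the limits.d change suggested earlier in this thread (the file name and the hdfs user name are assumptions; the datanode has to be restarted from a fresh login for the new limits to take effect):

  # /etc/security/limits.d/90-hdfs.conf
  hdfs    soft    nproc     65536
  hdfs    hard    nproc     65536
  hdfs    soft    nofile    102400
  hdfs    hard    nofile    102400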
Re: YARN HA Active ResourceManager failover when machine is stopped
Hi, Matt The second log file looks like node manager's log, not the standby resource manager. Thanks. Drake 민영근 Ph.D kt NexR On Fri, Apr 24, 2015 at 11:39 AM, Matt Narrell matt.narr...@gmail.com wrote: Active ResourceManager: http://pastebin.com/hE0ppmnb Standby ResourceManager: http://pastebin.com/DB8VjHqA Oppressively chatty and not much valuable info contained therein. On Apr 23, 2015, at 4:25 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: I have run into this offline with someone else too but couldn't root-cause it. Will you be able to share your active/standby ResourceManager logs via pastebin or something? +Vinod On Apr 23, 2015, at 9:41 AM, Matt Narrell matt.narr...@gmail.com wrote: I’m using Hadoop 2.6.0 from HDP 2.2.4 installed via Ambari 2.0 I’m testing the YARN HA ResourceManager failover. If I STOP the active ResourceManager (shut the machine off), the standby ResourceManager is elected to active, but the NodeManagers do not register themselves with the newly elected active ResourceManager. If I restart the machine (but DO NOT resume the YARN services) the NodeManagers register with the newly elected ResourceManager and my jobs resume. I assume I have some bad configuration, as this produces a SPOF, and is not HA in the sense I’m expecting. Thanks, mn
Re: ResourceLocalizationService: Localizer failed when running pi example
Hi, guess the yarn.nodemanager.local-dirs property is the problem. Can you provide that part of yarn-site.xml? Thanks. Drake 민영근 Ph.D kt NexR On Mon, Apr 20, 2015 at 4:27 AM, Fernando O. fot...@gmail.com wrote: yeah... there's not much there: -bash-4.1$ cd nm-local-dir/ -bash-4.1$ ll * filecache: total 0 nmPrivate: total 0 usercache: total 0 I'm using Open JDK, would that be a problem? More log: STARTUP_MSG: java = 1.7.0_75 / 2015-04-19 14:38:58,168 INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: registered UNIX signal handlers for [TERM, HUP, INT] 2015-04-19 14:38:58,562 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2015-04-19 14:38:59,018 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher 2015-04-19 14:38:59,020 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher 2015-04-19 14:38:59,021 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.LocalizationEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService 2015-04-19 14:38:59,021 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServicesEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices 2015-04-19 14:38:59,022 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl 2015-04-19 14:38:59,023 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncherEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher 2015-04-19 14:38:59,054 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.ContainerManagerEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl 2015-04-19 14:38:59,054 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.NodeManagerEventType for class org.apache.hadoop.yarn.server.nodemanager.NodeManager 2015-04-19 14:38:59,109 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties 2015-04-19 14:38:59,197 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s). 
2015-04-19 14:38:59,197 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system started 2015-04-19 14:38:59,217 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.event.LogHandlerEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.NonAggregatingLogHandler 2015-04-19 14:38:59,217 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: per directory file limit = 8192 2015-04-19 14:38:59,227 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.LocalizerEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker 2015-04-19 14:38:59,248 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: The Auxilurary Service named 'mapreduce_shuffle' in the configuration is for class class org.apache.hadoop.mapred.ShuffleHandler which has a name of 'httpshuffle'. Because these are not the same tools trying to send ServiceData and read Service Meta Data may have issues unless the refer to the name in the config. 2015-04-19 14:38:59,248 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Adding auxiliary service httpshuffle, mapreduce_shuffle 2015-04-19 14:38:59,281 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Using ResourceCalculatorPlugin :
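For comparison, the property being asked about usually looks something like this in yarn-site.xml (the paths are examples; every directory listed must exist and be writable by the user running the NodeManager):

  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/data/1/yarn/nm-local-dir,/data/2/yarn/nm-local-dir</value>
  </property>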
Re: Not able to run more than one map task
Hi, Amit. Test these: Increase yarn.nodemanager.resource.memory-mb beyond 8192. That's ok for testing. And decrease mapreduce.map.memory.mb to 256 and add yarn.nodemanager.vmem-check-enabled to false in yarn-site.xml. Thanks. Drake 민영근 Ph.D kt NexR On Sat, Apr 11, 2015 at 8:01 AM, Niels Basjes ni...@basjes.nl wrote: Just curious: what is the input for your job ? If it is a single gzipped file then that is the cause of getting exactly 1 mapper. Niels On Fri, Apr 10, 2015, 09:21 Amit Kumar amiti...@msn.com wrote: Thanks a lot Harsha for replying This problem has waster at least last one week. We tried what you suggested. Could you please take a look at the configuration and suggest if we missed c? System RAM : 8GB CPU : 4 threads each with 2 cores. # Disks : 1 MR2: mapreduce.map.memory.mb : 512 mapreduce.tasktracker.map.tasks.maximum : 4 Yarn: yarn.app.mapreduce.am.resource.mb : 512 yarn.nodemanager.resource.cpu-vcores : 4 yarn.scheduler.minimum-allocation-mb : 512 yarn.nodemanager.resource.memory-mb : 5080 Regards, Amit From: ha...@cloudera.com Date: Fri, 10 Apr 2015 10:20:24 +0530 Subject: Re: Not able to run more than one map task To: user@hadoop.apache.org You are likely memory/vcore starved in the NM's configs. Increase your yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores configs, or consider lowering the MR job memory request values to gain more parallelism. On Thu, Apr 9, 2015 at 5:05 PM, Amit Kumar amiti...@msn.com wrote: Hi All, We recently started working on Hadoop. We have setup the hadoop in pseduo distribution mode along with oozie. Every developer has set it up on his laptop. The problem is that we are not able to run more than one map task concurrently on our laptops. Resource manager is not allowing more than one task on our machine. My task gets completed if I submit it without Oozie. Oozie requires one map task for its own functioning. Actual task that oozie submit does not start. Here is my configuration -- Hadoop setup in Pseudo distribution mode -- Hadoop Version - 2.6 -- Oozie Version - 4.0.1 Regards, Amit -- Harsh J
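For reference, the number of containers a single NodeManager can run concurrently is roughly the arithmetic below (assuming memory, not vcores, is the binding resource):

  concurrent containers ≈ floor(yarn.nodemanager.resource.memory-mb / container size)
  with the values quoted above: floor((5080 - 512 for the AM) / 512) = 8 map containers

So the quoted settings should already allow several maps; also check Niels' point — a single gzipped input file always yields exactly one mapper, no matter how much memory is free.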
Re: Run my own application master on a specific node in a YARN cluster
Very interesting, BTW. So you try to launch app-master with YARN Container but your own node-manager without YARN Container, Am I right? Drake 민영근 Ph.D kt NexR On Wed, Apr 1, 2015 at 3:38 PM, Dongwon Kim eastcirc...@postech.ac.kr wrote: Thanks for your input but I need to launch my own node manager (different from the Yarn NM) running on each node. (which is not explained in the original question) If I were to launch just a single master with a well-known address, ZooKeeper would be a great solution! Thanks. Dongwon Kim 2015-03-31 10:47 GMT+09:00 Drake민영근 drake@nexr.com: Hi, In these circumstances, there is no easy way to do that. Maybe use workaround. How about using zookeeper for shared storage? The app master create predefined zookeeper node when starting with current machine's IP and Clients always look for that zookeeper node for app master's location. Thanks. Drake 민영근 Ph.D kt NexR On Mon, Mar 30, 2015 at 11:04 AM, Dongwon Kim eastcirc...@postech.ac.kr wrote: Hello, First of all, I'm using Hadoop-2.6.0. I want to launch my own app master on a specific node in a YARN cluster in order to open a server on a predetermined IP address and port. To that end, I wrote a driver program in which I created a ResourceRequest object and called setResourceName method to set a hostname, and attached it to a ApplicationSubmissionContext object by callingsetAMContainerResourceRequest method. I tried several times but couldn't launch the app master on a specific node. After searching code, I found that RMAppAttemptImpl invalidates what I've set in ResourceRequest as follows: // Currently, following fields are all hard code, // TODO: change these fields when we want to support // priority/resource-name/relax-locality specification for AM containers // allocation. appAttempt.amReq.setNumContainers(1); appAttempt.amReq.setPriority(AM_CONTAINER_PRIORITY); appAttempt.amReq.setResourceName(ResourceRequest.ANY); appAttempt.amReq.setRelaxLocality(true); Is there another way to launch a container for an application master on a specific node in Hadoop-2.6.0? Thanks. Dongwon Kim
Re: Run my own application master on a specific node in a YARN cluster
Hi, In these circumstances, there is no easy way to do that. Maybe use workaround. How about using zookeeper for shared storage? The app master create predefined zookeeper node when starting with current machine's IP and Clients always look for that zookeeper node for app master's location. Thanks. Drake 민영근 Ph.D kt NexR On Mon, Mar 30, 2015 at 11:04 AM, Dongwon Kim eastcirc...@postech.ac.kr wrote: Hello, First of all, I'm using Hadoop-2.6.0. I want to launch my own app master on a specific node in a YARN cluster in order to open a server on a predetermined IP address and port. To that end, I wrote a driver program in which I created a ResourceRequest object and called setResourceName method to set a hostname, and attached it to a ApplicationSubmissionContext object by callingsetAMContainerResourceRequest method. I tried several times but couldn't launch the app master on a specific node. After searching code, I found that RMAppAttemptImpl invalidates what I've set in ResourceRequest as follows: // Currently, following fields are all hard code, // TODO: change these fields when we want to support // priority/resource-name/relax-locality specification for AM containers // allocation. appAttempt.amReq.setNumContainers(1); appAttempt.amReq.setPriority(AM_CONTAINER_PRIORITY); appAttempt.amReq.setResourceName(ResourceRequest.ANY); appAttempt.amReq.setRelaxLocality(true); Is there another way to launch a container for an application master on a specific node in Hadoop-2.6.0? Thanks. Dongwon Kim
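A bare-bones sketch of that ZooKeeper workaround using the plain ZooKeeper client API (the znode path and the idea of passing in an already-connected ZooKeeper handle are assumptions; error handling is omitted):

  import org.apache.zookeeper.CreateMode;
  import org.apache.zookeeper.ZooKeeper;
  import org.apache.zookeeper.ZooDefs.Ids;

  public class AmAddressRegistry {
    private static final String PATH = "/myapp/am-address";

    // called by the app master once its server socket is bound
    public static void publish(ZooKeeper zk, String hostPort) throws Exception {
      if (zk.exists("/myapp", false) == null) {
        zk.create("/myapp", new byte[0], Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
      }
      // ephemeral: the node disappears automatically if the app master dies
      zk.create(PATH, hostPort.getBytes("UTF-8"), Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
    }

    // called by clients that need to locate the app master
    public static String lookup(ZooKeeper zk) throws Exception {
      return new String(zk.getData(PATH, false, null), "UTF-8");
    }
  }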
Re: Container beyond virtual memory limits
Hi, See 6. Killing of Tasks Due to Virtual Memory Usage in http://blog.cloudera.com/blog/2014/04/apache-hadoop-yarn-avoiding-6-time-consuming-gotchas/ Drake 민영근 Ph.D kt NexR On Sun, Mar 22, 2015 at 12:43 PM, Fei Hu hufe...@gmail.com wrote: Hi, I just test my yarn installation, and run a Wordcount program. But it always report the following error, who knows how to solve it? Thank you in advance. Container [pid=7954,containerID=container_1426992254950_0002_01_05] is running beyond virtual memory limits. Current usage: 13.6 MB of 1 GB physical memory used; 4.3 GB of 2.1 GB virtual memory used. Killing container. Dump of the process-tree for container_1426992254950_0002_01_05 : |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE |- 7960 7954 7954 7954 (java) 5 0 4576591872 3199 /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN 1638 -Djava.io.tmpdir=/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1426992254950_0002/container_1426992254950_0002_01_05/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/home/hadoop-lzl/hadoop-2.6.0/logs/userlogs/application_1426992254950_0002/container_1426992254950_0002_01_05 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 199.26.254.140 36542 attempt_1426992254950_0002_m_03_0 5 |- 7954 7949 7954 7954 (bash) 0 0 65421312 275 /bin/bash -c /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN 1638 -Djava.io.tmpdir=/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1426992254950_0002/container_1426992254950_0002_01_05/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/home/hadoop-lzl/hadoop-2.6.0/logs/userlogs/application_1426992254950_0002/container_1426992254950_0002_01_05 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 199.26.254.140 36542 attempt_1426992254950_0002_m_03_0 5 1/home/hadoop-lzl/hadoop-2.6.0/logs/userlogs/application_1426992254950_0002/container_1426992254950_0002_01_05/stdout 2/home/hadoop-lzl/hadoop-2.6.0/logs/userlogs/application_1426992254950_0002/container_1426992254950_0002_01_05/stderr Exception from container-launch. Container id: container_1426992254950_0002_01_05 Exit code: 1 Stack trace: ExitCodeException exitCode=1: at org.apache.hadoop.util.Shell.runCommand(Shell.java:538) at org.apache.hadoop.util.Shell.run(Shell.java:455) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:679) Thanks, Fei
Re: Prune out data to a specific reduce task
Hi, If you write a custom partitioner, just call it from the map method to confirm which partition a key maps to. You can get the number of reducers from mapcontext.getNumReduceTasks(), then get the reducer number from Partitioner.getPartition(key, value, numReduceTasks). Finally, just write the wanted records to the reducers. Caution: this breaks the parallelism of the MapReduce programming model quite badly. If you cut the records for reducer 2, that task is still up but does nothing. Thanks. Drake 민영근 Ph.D kt NexR On Fri, Mar 13, 2015 at 11:47 PM, xeonmailinglist-gmail xeonmailingl...@gmail.com wrote: Hi, The only obstacle is to know to which partition the map output would go. 1 ~ From the map method, how can I know to which partition the output goes? 2 ~ Can I call getPartition(K key, V value, int numReduceTasks) from the map function? Thanks, On 13-03-2015 03:25, Naganarasimha G R (Naga) wrote: I think Drake's comment "In the map method, records would be ignored with no output.collect() or context.write()." is the most valid way to do it, as it will avoid further processing downstream and hence fewer resources are consumed, because unwanted records are pruned at the source itself. Is there any obstacle to doing this in your map method? Regards, Naga -- *From:* xeonmailinglist-gmail [xeonmailingl...@gmail.com] *Sent:* Thursday, March 12, 2015 22:17 *To:* user@hadoop.apache.org *Subject:* Fwd: Re: Prune out data to a specific reduce task If I use the partitioner, I must be able to tell MapReduce not to execute values from a certain reduce task. The method public int getPartition(K key, V value, int numReduceTasks) must always return a partition. I can't return -1. Thus, I don't know how to tell MapReduce not to execute data from a partition. Any suggestion? Forwarded Message Subject: Re: Prune out data to a specific reduce task Date: Thu, 12 Mar 2015 12:40:04 -0400 From: Fei Hu hufe...@gmail.com Reply-To: user@hadoop.apache.org To: user@hadoop.apache.org Maybe you could use Partitioner.class to solve your problem. On Mar 11, 2015, at 6:28 AM, xeonmailinglist-gmail xeonmailingl...@gmail.com wrote: Hi, I have this job that has 3 map tasks and 2 reduce tasks. But I want to exclude data that would go to reduce task 2. This means that only reducer 1 will produce data; the other one will be empty, or may not even execute. How can I do this in MapReduce? ExampleJobExecution.png Thanks,
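A sketch of the map-side version of that idea, assuming the job uses the default HashPartitioner (the class is a word-count-style example; the pruned reducer index 2 comes from this thread):

  import java.io.IOException;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Mapper;
  import org.apache.hadoop.mapreduce.lib.partition.HashPartitioner;

  public class PruningMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final int PRUNED_REDUCER = 2;
    private static final IntWritable ONE = new IntWritable(1);
    private final HashPartitioner<Text, IntWritable> partitioner =
        new HashPartitioner<Text, IntWritable>();
    private final Text word = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws IOException, InterruptedException {
      for (String token : line.toString().split("\\s+")) {
        word.set(token);
        // ask the same partitioner the framework will use where this key would go
        int partition = partitioner.getPartition(word, ONE, context.getNumReduceTasks());
        if (partition == PRUNED_REDUCER) {
          continue; // drop records destined for reducer 2
        }
        context.write(word, ONE);
      }
    }
  }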
Re: Prune out data to a specific reduce task
In the map method, records would be ignored with no output.collect() or context.write(). Or you just delete output file from reducer 2 at the end of job. the reducer 2's result file is part-r-2. Drake 민영근 Ph.D kt NexR On Wed, Mar 11, 2015 at 9:43 PM, Fabio C. anyte...@gmail.com wrote: As far as I know the code running in each reducer is the same you specify in your reduce function, so if you know in advance the features of the data you want to ignore you can just instruct reducers to do so. If you are able to tell whether or not to keep an entry at the beginning, you can filter them out within the map function. I could think of a wordcount example where we tell the map phase to ignore all the words starting with a specific letter... What kind of data are you processing and what is the filtering condition? Anyway I'm sorry I can't help with the actual code, but I'm not really into this right now. On Wed, Mar 11, 2015 at 12:13 PM, xeonmailinglist-gmail xeonmailingl...@gmail.com wrote: Maybe the correct question is, how can I filter data in mapreduce in Java? On 11-03-2015 10:36, xeonmailinglist-gmail wrote: To exclude data to a specific reducer, should I build a partitioner that do this? Should I have a map function that checks to which reduce task the output goes? Can anyone give me some suggestion? And by the way, I really want to exclude data to a reduce task. So, I will run more than 1 reducer, even if one of them does not get input data. On 11-03-2015 10:28, xeonmailinglist-gmail wrote: Hi, I have this job that has 3 map tasks and 2 reduce tasks. But, I want to excludes data that will go to the reduce task 2. This means that, only reducer 1 will produce data, and the other one will be empty, or even it doesn't execute. How can I do this in MapReduce? [image: Example Job Execution] Thanks, -- -- -- -- -- --
Re: how to find corrupt block in java code
Hi, cho I think you may start digging from org.apache.hadoop.hdfs.tools.DFSck.java and org.apache.hadoop.hdfs.server.namenode.FsckServlet.java. Good luck! Drake 민영근 Ph.D kt NexR On Mon, Mar 2, 2015 at 3:22 PM, cho ju il tjst...@kgrid.co.kr wrote: hadoop version 2.4.1 I can find corrupt files. $HADOOP_PREFIX/bin/hdfs fsck / -list-corruptfileblocks How to find corrupt block in java code ?
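If all you need from Java is the list of affected files, there is also a client-side call that returns roughly what 'hdfs fsck / -list-corruptfileblocks' prints; a small sketch (the namenode URI is a placeholder):

  import java.net.URI;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.fs.RemoteIterator;

  public class ListCorrupt {
    public static void main(String[] args) throws Exception {
      FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), new Configuration());
      // paths of files that currently have corrupt blocks
      RemoteIterator<Path> it = fs.listCorruptFileBlocks(new Path("/"));
      while (it.hasNext()) {
        System.out.println(it.next());
      }
    }
  }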
Re: tracking remote reads in datanode logs
Hi, Igor The AM logs are in the Hdfs if you set log aggregation property. Otherwise, they are in the container log directory. See this: http://ko.hortonworks.com/blog/simplifying-user-logs-management-and-access-in-yarn/ Thanks 2015년 2월 25일 수요일, Igor Bogomolovigor.bogomo...@gmail.com님이 작성한 메시지: Hi Drake, Thanks for a pointer. AM log indeed have information about remote map tasks. But I'd like to have more low level details. Like on which node each map task was scheduled and how many bytes was read. That should be exactly in datanode log and I saw it for another job. But after I reinstall the cluster it's not there anymore :( Could you please tell the path where AM log is located (from which you copied the lines)? I found it in web interface but not as file on a disk. And nothing in /var/log/hadoop-* Thanks, Igor On Tue, Feb 24, 2015 at 1:51 AM, Drake민영근 drake@nexr.com javascript:_e(%7B%7D,'cvml','drake@nexr.com'); wrote: I found this in the mapreduce am log. 2015-02-23 11:22:45,576 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:1 ScheduledMaps:5 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 HostLocal:0 RackLocal:0 .. 2015-02-23 11:22:46,641 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:1 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:5 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:5 ContRel:0 HostLocal:3 RackLocal:2 .. The first line says Map tasks are 5 and second says HostLocal 3 and Rack Local 2. I think the Rack Local 2 are the remote map tasks as you mentioned before. Drake 민영근 Ph.D kt NexR On Tue, Feb 24, 2015 at 9:45 AM, Drake민영근 drake@nexr.com javascript:_e(%7B%7D,'cvml','drake@nexr.com'); wrote: Hi, Igor Did you look at the mapreduce application master log? I think the local or rack local map tasks are logged in the MapReduce AM log. Good luck. Drake 민영근 Ph.D kt NexR On Tue, Feb 24, 2015 at 3:30 AM, Igor Bogomolov igor.bogomo...@gmail.com javascript:_e(%7B%7D,'cvml','igor.bogomo...@gmail.com'); wrote: Hi all, In a small cluster of 5 nodes that run CDH 5.3.0 (Hadoop 2.5.0) I want to know how many remote map tasks (ones that read input data from remote nodes) there are in a mapreduce job. For this purpose I took logs of each datanode an looked for lines with op: HDFS_READ and cliID field that contains map task id. Surprisingly, 4 datanode logs does not contain lines with op: HDFS_READ. Another 1 has many lines with op: HDFS_READ but all cliID look like DFSClient_NONMAPREDUCE_* and does not contain any map task id. I concluded there are no remote map tasks but that does not look correct. Also even local reads are not logged (because there is no line where cliID field contains some map task id). Could anyone please explain what's wrong? Why logging is not working? (I use default settings). Chris, Found HADOOP-3062 https://issues.apache.org/jira/browse/HADOOP-3062 that you have implemented. Thought you might have an explanation. Best, Igor -- Drake 민영근 Ph.D kt NexR
Re: tracking remote reads in datanode logs
I found this in the mapreduce am log. 2015-02-23 11:22:45,576 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:1 ScheduledMaps:5 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 HostLocal:0 RackLocal:0 .. 2015-02-23 11:22:46,641 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:1 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:5 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:5 ContRel:0 HostLocal:3 RackLocal:2 .. The first line says Map tasks are 5 and second says HostLocal 3 and Rack Local 2. I think the Rack Local 2 are the remote map tasks as you mentioned before. Drake 민영근 Ph.D kt NexR On Tue, Feb 24, 2015 at 9:45 AM, Drake민영근 drake@nexr.com wrote: Hi, Igor Did you look at the mapreduce application master log? I think the local or rack local map tasks are logged in the MapReduce AM log. Good luck. Drake 민영근 Ph.D kt NexR On Tue, Feb 24, 2015 at 3:30 AM, Igor Bogomolov igor.bogomo...@gmail.com wrote: Hi all, In a small cluster of 5 nodes that run CDH 5.3.0 (Hadoop 2.5.0) I want to know how many remote map tasks (ones that read input data from remote nodes) there are in a mapreduce job. For this purpose I took logs of each datanode an looked for lines with op: HDFS_READ and cliID field that contains map task id. Surprisingly, 4 datanode logs does not contain lines with op: HDFS_READ. Another 1 has many lines with op: HDFS_READ but all cliID look like DFSClient_NONMAPREDUCE_* and does not contain any map task id. I concluded there are no remote map tasks but that does not look correct. Also even local reads are not logged (because there is no line where cliID field contains some map task id). Could anyone please explain what's wrong? Why logging is not working? (I use default settings). Chris, Found HADOOP-3062 https://issues.apache.org/jira/browse/HADOOP-3062 that you have implemented. Thought you might have an explanation. Best, Igor
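Those locality numbers also end up as job counters, so they can be read after the job finishes without digging through the AM log (the job id below is a placeholder):

  mapred job -counter job_1424000000000_0001 org.apache.hadoop.mapreduce.JobCounter DATA_LOCAL_MAPS
  mapred job -counter job_1424000000000_0001 org.apache.hadoop.mapreduce.JobCounter RACK_LOCAL_MAPS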
Re: tracking remote reads in datanode logs
Hi, Igor

Did you look at the MapReduce application master log? I think the local or rack-local map tasks are logged in the MapReduce AM log. Good luck.

Drake 민영근 Ph.D kt NexR

On Tue, Feb 24, 2015 at 3:30 AM, Igor Bogomolov igor.bogomo...@gmail.com wrote:

Hi all, In a small cluster of 5 nodes that run CDH 5.3.0 (Hadoop 2.5.0) I want to know how many remote map tasks (ones that read input data from remote nodes) there are in a MapReduce job. For this purpose I took the logs of each datanode and looked for lines with op: HDFS_READ and a cliID field that contains a map task id. Surprisingly, 4 datanode logs do not contain lines with op: HDFS_READ. The other one has many lines with op: HDFS_READ, but all cliID values look like DFSClient_NONMAPREDUCE_* and do not contain any map task id. I concluded there are no remote map tasks, but that does not look correct. Also, even local reads are not logged (because there is no line where the cliID field contains some map task id). Could anyone please explain what's wrong? Why is logging not working? (I use default settings.)

Chris, I found HADOOP-3062 https://issues.apache.org/jira/browse/HADOOP-3062 that you have implemented. Thought you might have an explanation.

Best, Igor
Re: writing mappers and reducers question
I suggest standalone mode for developing a mapper or reducer, but in the case of a partitioner or combiner you need to set up pseudo-distributed mode.

Drake 민영근 Ph.D kt NexR

On Fri, Feb 20, 2015 at 3:18 PM, unmesha sreeveni unmeshab...@gmail.com wrote:

You can also write MapReduce jobs in Eclipse for testing purposes. Once that is done you can create a jar and run it on your single-node or multi-node cluster. But please note that while running in such IDEs with the Hadoop dependencies, there will be no input splits, different mappers, etc.
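To illustrate the standalone-mode workflow, here is a rough self-contained sketch (class names and the input/output paths are made up, and the word-count mapper/reducer is just a stand-in for your own code). Forcing mapreduce.framework.name=local and fs.defaultFS=file:/// makes the whole job run in a single JVM against the local file system, so it can be stepped through in an IDE debugger:

    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class LocalDebugJob {

      public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
          // Split each line into tokens and emit (token, 1).
          StringTokenizer it = new StringTokenizer(value.toString());
          while (it.hasMoreTokens()) {
            word.set(it.nextToken());
            context.write(word, ONE);
          }
        }
      }

      public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
          // Sum the counts for each token.
          int sum = 0;
          for (IntWritable v : values) {
            sum += v.get();
          }
          context.write(key, new IntWritable(sum));
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Standalone mode: run the whole job in this JVM against the local file system.
        conf.set("mapreduce.framework.name", "local");
        conf.set("fs.defaultFS", "file:///");

        Job job = Job.getInstance(conf, "local-debug");
        job.setJarByClass(LocalDebugJob.class);
        job.setMapperClass(TokenMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path("input"));    // local directory with test files
        FileOutputFormat.setOutputPath(job, new Path("output")); // must not already exist
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }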
Re: Name Node format error
Check the Hadoop version across the cluster, including the client machine.

Drake 민영근 Ph.D kt NexR

On Sun, Feb 8, 2015 at 8:04 AM, SP sajid...@gmail.com wrote:

Hi All, I see these errors in my JN logs when I am trying to set up HA. Can anyone help?

2015-02-07 14:32:41,220 WARN org.apache.hadoop.ipc.Server: Incorrect header or version mismatch from 192.168.1.100:45535 got version 7 expected version 9
2015-02-07 14:35:35,244 WARN org.apache.hadoop.ipc.Server: Incorrect header or version mismatch from 192.168.1.100:45539 got version 7 expected version 9
2015-02-07 14:43:44,390 WARN org.apache.hadoop.ipc.Server: Incorrect header or version mismatch from 192.168.1.100:45551 got version 7 expected version 9
~

and I am unable to format my name node.

*15/02/07 14:25:13 FATAL namenode.NameNode: Exception in namenode join*
*org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. 1 successful responses:*
*192.168.1.100:8485: false*
*1 exceptions thrown:*
*192.168.1.102:8485: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Message missing required fields: callId, status; Host Details : local host is: sspnamenode.sajid.com/192.168.1.100; destination host is: sspdatanode2:8485; *
Re: How to partition a file to smaller size for performing KNN in hadoop mapreduce
Hi, how about this? The large model data stays in HDFS, but with many replicas, and the MapReduce program reads the model from HDFS. In theory, if the replication factor of the model data equals the number of data nodes, then with the Short-Circuit Local Reads feature of the HDFS datanode the map or reduce tasks read the model data from their own local disks. This may use a lot of HDFS space, but the annoying partitioning problem will be gone.

Thanks

Drake 민영근 Ph.D

On Thu, Jan 15, 2015 at 6:05 PM, unmesha sreeveni unmeshab...@gmail.com wrote:

Is there any way? Waiting for a reply. I have posted the question everywhere, but no one is responding. I feel like this is the right place to ask doubts, as some of you may have come across the same issue and got stuck.

On Thu, Jan 15, 2015 at 12:34 PM, unmesha sreeveni unmeshab...@gmail.com wrote:

Yes, one of my friends is implementing the same. I know global sharing of data is not possible across Hadoop MapReduce, but I need to check if it can be done somehow in Hadoop MapReduce also, because I found some papers on KNN in Hadoop too, and I am trying to compare the performance. Hope some pointers can help me.

On Thu, Jan 15, 2015 at 12:17 PM, Ted Dunning ted.dunn...@gmail.com wrote:

Have you considered implementing this using something like Spark? That could be much easier than raw map-reduce.

On Wed, Jan 14, 2015 at 10:06 PM, unmesha sreeveni unmeshab...@gmail.com wrote:

In a KNN-like algorithm we need to load the model data into cache for predicting the records. Here is the example for KNN. [image: Inline image 1] So if the model is a large file, say 1 or 2 GB, we will not be able to load it into the Distributed Cache. One way is to split/partition the model result into some files, perform the distance calculation for all records in each file, and then find the minimum distance and the most frequent class label to predict the outcome. How can we partition the file and perform the operation on these partitions? i.e. 1st record: distance to partition1, partition2, ...; 2nd record: distance to partition1, partition2, ... This is what came to my thought. Is there any other way? Any pointers would help me.

-- *Thanks Regards * *Unmesha Sreeveni U.B* *Hadoop, Bigdata Developer* *Centre for Cyber Security | Amrita Vishwa Vidyapeetham* http://www.unmeshasreeveni.blogspot.in/

-- *Thanks Regards * *Unmesha Sreeveni U.B* *Hadoop, Bigdata Developer* *Centre for Cyber Security | Amrita Vishwa Vidyapeetham* http://www.unmeshasreeveni.blogspot.in/

-- *Thanks Regards * *Unmesha Sreeveni U.B* *Hadoop, Bigdata Developer* *Centre for Cyber Security | Amrita Vishwa Vidyapeetham* http://www.unmeshasreeveni.blogspot.in/
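As a rough sketch of Drake's suggestion (the model path and the record handling are illustrative, not from the thread), a mapper can read the model straight from HDFS in setup() instead of using the distributed cache; with a high replication factor and short-circuit reads, the open() is usually served from the local disk:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class KnnMapper extends Mapper<LongWritable, Text, Text, Text> {
      private final List<String> model = new ArrayList<String>();

      @Override
      protected void setup(Context context) throws IOException, InterruptedException {
        // Hypothetical model location; in practice pass it in via the job configuration.
        Path modelPath = new Path("/user/model/data");
        FileSystem fs = FileSystem.get(context.getConfiguration());
        FSDataInputStream in = fs.open(modelPath);
        BufferedReader reader = new BufferedReader(new InputStreamReader(in));
        try {
          String line;
          while ((line = reader.readLine()) != null) {
            model.add(line); // keep each model record in memory for the distance calculation
          }
        } finally {
          reader.close();
        }
      }

      @Override
      protected void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        // Compute the distance from this test record to every model record here.
      }
    }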
Re: How to partition a file to smaller size for performing KNN in hadoop mapreduce
In my suggestion, the map or reduce tasks do not use the distributed cache. They read the file directly from HDFS with short-circuit local reads. It is like a shared-storage method, but almost every node has the data locally because of the high replication factor.

Drake 민영근 Ph.D

On Wed, Jan 21, 2015 at 1:49 PM, unmesha sreeveni unmeshab...@gmail.com wrote:

But still, if the model is very large, how can we load it into the Distributed Cache or something like that? Here is one source: http://www.cs.utah.edu/~lifeifei/papers/knnslides.pdf But it is confusing me.

On Wed, Jan 21, 2015 at 7:30 AM, Drake민영근 drake@nexr.com wrote:

Hi, how about this? The large model data stays in HDFS, but with many replicas, and the MapReduce program reads the model from HDFS. In theory, if the replication factor of the model data equals the number of data nodes, then with the Short-Circuit Local Reads feature of the HDFS datanode the map or reduce tasks read the model data from their own local disks. This may use a lot of HDFS space, but the annoying partitioning problem will be gone.

Thanks

Drake 민영근 Ph.D

On Thu, Jan 15, 2015 at 6:05 PM, unmesha sreeveni unmeshab...@gmail.com wrote:

Is there any way? Waiting for a reply. I have posted the question everywhere, but no one is responding. I feel like this is the right place to ask doubts, as some of you may have come across the same issue and got stuck.

On Thu, Jan 15, 2015 at 12:34 PM, unmesha sreeveni unmeshab...@gmail.com wrote:

Yes, one of my friends is implementing the same. I know global sharing of data is not possible across Hadoop MapReduce, but I need to check if it can be done somehow in Hadoop MapReduce also, because I found some papers on KNN in Hadoop too, and I am trying to compare the performance. Hope some pointers can help me.

On Thu, Jan 15, 2015 at 12:17 PM, Ted Dunning ted.dunn...@gmail.com wrote:

Have you considered implementing this using something like Spark? That could be much easier than raw map-reduce.

On Wed, Jan 14, 2015 at 10:06 PM, unmesha sreeveni unmeshab...@gmail.com wrote:

In a KNN-like algorithm we need to load the model data into cache for predicting the records. Here is the example for KNN. [image: Inline image 1] So if the model is a large file, say 1 or 2 GB, we will not be able to load it into the Distributed Cache. One way is to split/partition the model result into some files, perform the distance calculation for all records in each file, and then find the minimum distance and the most frequent class label to predict the outcome. How can we partition the file and perform the operation on these partitions? i.e. 1st record: distance to partition1, partition2, ...; 2nd record: distance to partition1, partition2, ... This is what came to my thought. Is there any other way? Any pointers would help me.

-- *Thanks Regards * *Unmesha Sreeveni U.B* *Hadoop, Bigdata Developer* *Centre for Cyber Security | Amrita Vishwa Vidyapeetham* http://www.unmeshasreeveni.blogspot.in/

-- *Thanks Regards * *Unmesha Sreeveni U.B* *Hadoop, Bigdata Developer* *Centre for Cyber Security | Amrita Vishwa Vidyapeetham* http://www.unmeshasreeveni.blogspot.in/

-- *Thanks Regards * *Unmesha Sreeveni U.B* *Hadoop, Bigdata Developer* *Centre for Cyber Security | Amrita Vishwa Vidyapeetham* http://www.unmeshasreeveni.blogspot.in/

-- *Thanks Regards * *Unmesha Sreeveni U.B* *Hadoop, Bigdata Developer* *Centre for Cyber Security | Amrita Vishwa Vidyapeetham* http://www.unmeshasreeveni.blogspot.in/
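For reference, short-circuit local reads are not on by default; they are usually enabled with the following hdfs-site.xml properties on both the datanodes and the clients (the socket path below is only an example, and the native libhadoop library must be available):

    dfs.client.read.shortcircuit = true
    dfs.domain.socket.path = /var/lib/hadoop-hdfs/dn_socket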
Re: How to partition a file to smaller size for performing KNN in hadoop mapreduce
Yes, almost the same. I assume the most time-consuming part was copying the model data from a datanode which has the model data to the actual processing node (tasktracker or nodemanager). What about the model data's replication factor? How many nodes do you have? If you have 4 or more nodes, you can increase the replication with the following command. I suggest a number equal to your number of datanodes, but first you should confirm there is enough space in HDFS.

- hdfs dfs -setrep -w 6 /user/model/data

Drake 민영근 Ph.D

On Wed, Jan 21, 2015 at 2:12 PM, unmesha sreeveni unmeshab...@gmail.com wrote:

Yes, I tried the same, Drake. I don't know if I understood your answer. Instead of loading them in setup() through the cache, I read them directly from HDFS in the map section, and for each incoming record I found the distance to all the records in HDFS, i.e. if R and S are my datasets, R is the model data stored in HDFS, and when S is taken for processing: S1-R (finding the distance with the whole R set), S2-R, ... But it is taking a long time as it needs to compute the distances.

On Wed, Jan 21, 2015 at 10:31 AM, Drake민영근 drake@nexr.com wrote:

In my suggestion, the map or reduce tasks do not use the distributed cache. They read the file directly from HDFS with short-circuit local reads. It is like a shared-storage method, but almost every node has the data locally because of the high replication factor.

Drake 민영근 Ph.D

On Wed, Jan 21, 2015 at 1:49 PM, unmesha sreeveni unmeshab...@gmail.com wrote:

But still, if the model is very large, how can we load it into the Distributed Cache or something like that? Here is one source: http://www.cs.utah.edu/~lifeifei/papers/knnslides.pdf But it is confusing me.

On Wed, Jan 21, 2015 at 7:30 AM, Drake민영근 drake@nexr.com wrote:

Hi, how about this? The large model data stays in HDFS, but with many replicas, and the MapReduce program reads the model from HDFS. In theory, if the replication factor of the model data equals the number of data nodes, then with the Short-Circuit Local Reads feature of the HDFS datanode the map or reduce tasks read the model data from their own local disks. This may use a lot of HDFS space, but the annoying partitioning problem will be gone.

Thanks

Drake 민영근 Ph.D

On Thu, Jan 15, 2015 at 6:05 PM, unmesha sreeveni unmeshab...@gmail.com wrote:

Is there any way? Waiting for a reply. I have posted the question everywhere, but no one is responding. I feel like this is the right place to ask doubts, as some of you may have come across the same issue and got stuck.

On Thu, Jan 15, 2015 at 12:34 PM, unmesha sreeveni unmeshab...@gmail.com wrote:

Yes, one of my friends is implementing the same. I know global sharing of data is not possible across Hadoop MapReduce, but I need to check if it can be done somehow in Hadoop MapReduce also, because I found some papers on KNN in Hadoop too, and I am trying to compare the performance. Hope some pointers can help me.

On Thu, Jan 15, 2015 at 12:17 PM, Ted Dunning ted.dunn...@gmail.com wrote:

Have you considered implementing this using something like Spark? That could be much easier than raw map-reduce.

On Wed, Jan 14, 2015 at 10:06 PM, unmesha sreeveni unmeshab...@gmail.com wrote:

In a KNN-like algorithm we need to load the model data into cache for predicting the records. Here is the example for KNN. [image: Inline image 1] So if the model is a large file, say 1 or 2 GB, we will not be able to load it into the Distributed Cache.
One way is to split/partition the model result into some files, perform the distance calculation for all records in each file, and then find the minimum distance and the most frequent class label to predict the outcome. How can we partition the file and perform the operation on these partitions? i.e. 1st record: distance to partition1, partition2, ...; 2nd record: distance to partition1, partition2, ... This is what came to my thought. Is there any other way? Any pointers would help me.

-- *Thanks Regards * *Unmesha Sreeveni U.B* *Hadoop, Bigdata Developer* *Centre for Cyber Security | Amrita Vishwa Vidyapeetham* http://www.unmeshasreeveni.blogspot.in/

-- *Thanks Regards * *Unmesha Sreeveni U.B* *Hadoop, Bigdata Developer* *Centre for Cyber Security | Amrita Vishwa Vidyapeetham* http://www.unmeshasreeveni.blogspot.in/

-- *Thanks Regards * *Unmesha Sreeveni U.B* *Hadoop, Bigdata Developer* *Centre for Cyber Security | Amrita Vishwa Vidyapeetham* http://www.unmeshasreeveni.blogspot.in/

-- *Thanks Regards * *Unmesha Sreeveni U.B* *Hadoop, Bigdata Developer* *Centre for Cyber Security | Amrita Vishwa Vidyapeetham* http://www.unmeshasreeveni.blogspot.in/

-- *Thanks Regards * *Unmesha Sreeveni U.B* *Hadoop, Bigdata Developer* *Centre for Cyber Security | Amrita Vishwa Vidyapeetham* http://www.unmeshasreeveni.blogspot.in/
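If it is more convenient than the hdfs dfs -setrep command shown above, the same replication change can be requested from Java; a small sketch (the path matches the earlier example, the class name is made up):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class RaiseModelReplication {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // Ask the NameNode to raise the replication factor of the model file to 6.
        boolean accepted = fs.setReplication(new Path("/user/model/data"), (short) 6);
        System.out.println("replication change accepted: " + accepted);
      }
    }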
Re: hadoop yarn
Hi siva,

An MR program is almost the same for both MR1 and MR2; only the framework needed to run the program is different. If your previous program was written with the new API (the org.apache.hadoop.mapreduce packages), just re-compile it with the Hadoop 2 libraries. Some errors or deprecated-method warnings may pop up, but nothing critical. Wish you luck. Thanks.

Drake Min

Drake 민영근 Ph.D

On Tue, Jan 20, 2015 at 4:09 PM, siva kumar siva165...@gmail.com wrote:

Thanks, Rohith. Do we have any examples on MR2 other than word count? Because I don't find much difference in the word count example between MR1 and MR2. I'm new to YARN, so if you could suggest any example programs on MR2 it would help me out in a better way.

Thanks and regards, siva

On Tue, Jan 20, 2015 at 11:45 AM, Rohith Sharma K S rohithsharm...@huawei.com wrote:

Refer to the link below:
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html

Thanks & Regards
Rohith Sharma K S

*From:* siva kumar [mailto:siva165...@gmail.com]
*Sent:* 20 January 2015 11:24
*To:* user@hadoop.apache.org
*Subject:* hadoop yarn

Hi All, Can anyone suggest a few links for writing MR2 programs on YARN?

Thanks and regards, siva
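To make the "new API" point concrete, here is a tiny non-word-count mapper written against org.apache.hadoop.mapreduce (the class name and logic are only illustrative); code like this typically recompiles unchanged against the Hadoop 2 libraries and runs on YARN:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // New-API mapper: note the org.apache.hadoop.mapreduce import,
    // not the old org.apache.hadoop.mapred one.
    public class LineLengthMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
      @Override
      protected void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        // Emit the length of every input line under a single key.
        context.write(new Text("line-length"), new IntWritable(value.getLength()));
      }
    }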