location of hadoop2.x api doc

2015-05-12 Thread lujinhong
Hi, I can’t find the API documentation for some classes in Hadoop 2.x, such as RunJar, YarnChild and so on. I found these API docs in Hadoop 1.x, for example: http://hadoop.apache.org/docs/r1.2.1/api/org/apache/hadoop/util/RunJar.html

Cannot initialize cluster issue - Why is the jobclient-tests jar needed?

2015-05-12 Thread rab ra
Hello, In one of my use cases, I am running a Hadoop job using the following command: java -cp /etc/hadoop/conf .class This command gave an error: "cannot initialize cluster. please check the configuration for mapreduce.framework.name and the correspond server address" I understand that i n
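That error usually means the MapReduce client could not find a usable cluster configuration on its classpath. As a hedged sketch (the value shown is the common choice, not something confirmed in this thread), a mapred-site.xml fragment reachable from the classpath would look like:

```xml
<!-- mapred-site.xml: tells the client which execution framework to use.
     "yarn" requires a reachable ResourceManager; "local" runs in-process. -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
```

If this property resolves to yarn but the YARN client classes (e.g. from the jobclient jars) are missing from the classpath, cluster initialization fails with exactly this message.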

Re: Smaller block size for more intense jobs

2015-05-12 Thread Harshit Mathur
Hi Marko, If your files are very small (less than the block size) then a lot of map tasks will get executed, but the per-task initialization and overhead degrade overall performance, so it might appear that each single map executes very fast while the overall job execution takes more time. I
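The trade-off Harshit describes can be made concrete. A stand-alone sketch in plain Java (the file and block sizes are illustrative assumptions, not numbers from the thread) of how block size drives the map task count:

```java
public class MapTaskCount {
    // One map task is typically launched per input block (ceiling division).
    public static long mapTasks(long fileSizeBytes, long blockSizeBytes) {
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        long oneGiB = 1024 * mb;
        // Halving the block size doubles the task count, and with it the
        // per-task JVM startup and scheduling overhead.
        System.out.println(mapTasks(oneGiB, 64 * mb)); // 16 tasks
        System.out.println(mapTasks(oneGiB, 32 * mb)); // 32 tasks
    }
}
```

Each extra task pays a fixed startup cost, which is why many small blocks can slow the job even when individual maps finish quickly.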

RE: Lost mapreduce applications displayed in UI

2015-05-12 Thread Rohith Sharma K S
Hi, Do you remember the steps after which applications stopped being displayed in the RM web UI? I mean, after which actions in the RM web UI are applications not displaying? Is any filtering applied in the UI, like “Showing 0 to 0 of 0 entries (filtered from 4 total entries)” at the bottom of the RM appli

Re: Re: Re: Re: Filtering by value in Reducer

2015-05-12 Thread Drake민영근
Hi, Did you try MapReduce local mode with smaller input data? Writing a test case with MRUnit is also very helpful for debugging. Thanks. Drake 민영근 Ph.D kt NexR On Tue, May 12, 2015 at 11:23 PM, Peter Ruch wrote: > Hi, > > No, I did not create any custom logs, I was only looking through the > "sta

Re: Lost mapreduce applications displayed in UI

2015-05-12 Thread Zhijie Shen
Maybe you have hit the completed app limit (1 by default). Once the limit is hit, the oldest completed app will be removed from the cache. - Zhijie From: hitarth trivedi Sent: Tuesday, May 12, 2015 3:32 PM To: user@hadoop.apache.org Subject: Lost mapreduce appli
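The cache Zhijie refers to is controlled by a ResourceManager setting. As a hedged sketch, the relevant yarn-site.xml fragment would be (the value shown is only an illustration; check yarn-default.xml for the actual shipped default in your version):

```xml
<!-- yarn-site.xml: how many completed applications the ResourceManager
     keeps in memory for the web UI before evicting the oldest. -->
<property>
  <name>yarn.resourcemanager.max-completed-applications</name>
  <value>10000</value>
</property>
```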

Lost mapreduce applications displayed in UI

2015-05-12 Thread hitarth trivedi
Hi, My cluster suddenly stopped displaying application information in the UI ( http://localhost:8088/cluster/apps). Although the counters like 'Apps Submitted', 'Apps Completed', 'Apps Running' etc. all seem to increment accurately and display the right information whenever I start a new mapreduce job.

Re: distcp fails with s3n or s3a in 2.6.0

2015-05-12 Thread Stephen Armstrong
Thanks Chris, I don't know why I couldn't find that e-mail chain, but the "mapreduce.application.classpath" property is what I needed to change. Thanks for the help. Steve On Mon, May 11, 2015 at 9:59 PM, Chris Nauroth wrote: > Hello Steve, > > There was a similar discussion about this on th
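For readers hitting the same distcp failure, the fix discussed is extending the task classpath so the s3n/s3a filesystem classes can be loaded. A hedged mapred-site.xml sketch (the paths are illustrative assumptions; the key point is appending the tools lib directory that holds hadoop-aws and the AWS SDK jars):

```xml
<!-- mapred-site.xml: extend the MR task classpath so s3n/s3a classes
     resolve inside distcp map tasks. Paths below are an example layout. -->
<property>
  <name>mapreduce.application.classpath</name>
  <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*,$HADOOP_MAPRED_HOME/share/hadoop/tools/lib/*</value>
</property>
```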

Smaller block size for more intense jobs

2015-05-12 Thread marko.dinic
Hello, I'm in doubt whether I should specify the block size to be smaller than 64MB in case my mappers need to do intensive computations. I know that it is better to have larger files, because of replication and the NameNode being a weak point, but I don't have that much data, and the operations that

suppress empty part files output

2015-05-12 Thread Shushant Arora
While using MultipleOutputs we use LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class); to suppress default empty part files from reducers. What's the syntax for using this in an Oozie action, or in an XML file for ToolRunner? Is it mapreduce.output.lazyoutputformat.outputformat? t
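LazyOutputFormat.setOutputFormatClass() effectively sets two configuration entries, so the XML equivalent would be, as a hedged sketch (class names assume the new mapreduce API; verify against your Hadoop version):

```xml
<!-- Sketch of the two properties that
     LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class)
     sets under the hood. -->
<property>
  <name>mapreduce.job.outputformat.class</name>
  <value>org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat</value>
</property>
<property>
  <name>mapreduce.output.lazyoutputformat.outputformat</name>
  <value>org.apache.hadoop.mapreduce.lib.output.TextOutputFormat</value>
</property>
```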

Re: Pig 0.14.0 on Hadoop 2.6.0 deprecation errors

2015-05-12 Thread Prashant Kommireddi
Something that needs correction, just that no one has gotten around to doing it. Please feel free to open a JIRA, even better if you would like to contribute a fix. On Tuesday, May 12, 2015, Anand Murali wrote: > Oliver: > > Many thanks for reply. If it is not an error why is the info repeated >

Re: How to access value of variable in Driver class which has been declared and modified inside Mapper class?

2015-05-12 Thread Shahab Yunus
Here are some examples of how to use custom counters: http://www.ashishpaliwal.com/blog/2012/05/hadoop-recipe-using-custom-java-counters/ Regards, Shahab On May 12, 2015 1:29 PM, "Shahab Yunus" wrote: > Better options than using static variable are, imo: > > One option it use Counters. Check tha

Re: How to access value of variable in Driver class which has been declared and modified inside Mapper class?

2015-05-12 Thread Shahab Yunus
Better options than using a static variable are, imo: One option is to use Counters. Check that API. We are using that for values that are numeric and we need those in the driver once the job finishes. You can create your custom counters too. The other option is (if you need more than just one value or yo

How to access value of variable in Driver class which has been declared and modified inside Mapper class?

2015-05-12 Thread Answer Agrawal
Hi, I declared a variable and incremented/modified it inside the Mapper class. Now I need to use the modified value of that variable in the Driver class. I declared a static variable inside the Mapper class and its modified value works in the Driver class when I run the code in Eclipse IDE. But after creating tha

Re: URI missing scheme and authority in job start with new FileSystem implementation

2015-05-12 Thread Silvan Kaiser
Hi Varun, hi List! Just a small success feedback note: It took me quite a while, but in the end I found out that not my resolvePath() method but AbstractFileSystem.java's was being used, sigh. The solution was simply to add an override in the DelegateToFileSystem impl; this override explicitly calls fsImpl

Re: Re: Re: Re: Filtering by value in Reducer

2015-05-12 Thread Peter Ruch
Hi, No, I did not create any custom logs, I was only looking through the "standard" logs. I just started out with Hadoop and did not think of explicitly logging that part of the code, as I thought that I am simply missing a small detail that someone of you might spot. But I will definitely l

Re: Re: Re: Filtering by value in Reducer

2015-05-12 Thread Shahab Yunus
Have you tried explicitly printing or logging in your reducer around the code that compares and then outputs the values? Maybe that will give you a clue as to what is happening. Debug the threshold value that you get in the reducer and whether it is what you have set or not (in case of when you set

Re: Re: Re: Filtering by value in Reducer

2015-05-12 Thread Peter Ruch
Hi, I already skimmed through the logs but I could not find anything special. I am just really confused why I am having this problem. If the Iterable<...> for a specific key contains all of the observed values - and it seems to do so, otherwise the program wouldn't work correctly in the standar

Re: output directory in Pig

2015-05-12 Thread Ted Yu
Looks like a question for pig mailing list: http://pig.apache.org/mailing_lists.html#Users Cheers > On May 12, 2015, at 4:14 AM, Anand Murali wrote: > > Dear All: > > I am running pig 0.14.0 on hadoop 2.6 pseudo mode. I would like to know, > where I can set job output path, such that I can m

Re: Reading a sequence file from distributed cache

2015-05-12 Thread Marko Dinic
Dear Shahab, Thanks, I didn't understand that. Now I get it. Best regards, Marko On Tue 12 May 2015 01:38:52 PM CEST, Shahab Yunus wrote: getLocalCacheFiles is deprecated and can only access files that were downloaded locally to the node running the task. Use of getCacheFiles is encouraged no

Re: Reading a sequence file from distributed cache

2015-05-12 Thread Shahab Yunus
getLocalCacheFiles is deprecated and can only access files that were downloaded locally to the node running the task. Use of getCacheFiles is encouraged now which downloads using a URI. Have you seen this? http://stackoverflow.com/questions/26492964/are-getcachefiles-and-getlocalcachefiles-the-sa

output directory in Pig

2015-05-12 Thread Anand Murali
Dear All: I am running pig 0.14.0 on hadoop 2.6 pseudo mode. I would like to know where I can set the job output path, such that I can manage output files. Reply most welcome. Thanks. Regards, Anand Murali

Re: Reading a sequence file from distributed cache

2015-05-12 Thread Marko Dinic
Hello, I have used getCacheFiles() instead of getLocalCacheFiles() and now it works. Can someone please explain the difference between the two? I'm not able to find some good explanation about it to understand how it works. Thanks, Marko On 05/11/2015 11:25 PM, marko.di...@nissatech.com wr

Re: Question about Block size configuration

2015-05-12 Thread Drake민영근
Hi, I think the metadata size is not greatly different. The problem is the number of blocks. If the block size is less than 64MB, more blocks are generated for the same file size (if 32MB, then 2x more blocks). And yes, all metadata is in the namenode's heap memory. Thanks. Drake 민영근 Ph.D kt NexR On Tue

Re: Re: Filtering by value in Reducer

2015-05-12 Thread Drake민영근
Hi, Peter. The missing records - are they just gone without any logs? How about your reduce task logs? Thanks Drake 민영근 Ph.D kt NexR On Tue, May 12, 2015 at 5:18 AM, Peter Ruch wrote: > Hello, > > sum and threshold are both Integers. > for the threshold variable I first add a new resource to t

Execute an external command with Hadoop 2.6.0

2015-05-12 Thread Pasquale Salza
Hi there, I have a Hadoop 2.6.0 cluster running on CentOS, Hortonworks distribution. I'm trying to execute an external command within a Mapper execution, but I didn't manage to invoke a script either with Shell.ShellCommandExecutor or with ProcessBuilder. It is like it can't read from the host loc
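For reference, the ProcessBuilder part can be exercised outside Hadoop first. A minimal stand-alone sketch (plain JDK; the echo command is just a placeholder for the real script) of invoking an external command and capturing its output, as a Mapper might:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class ShellExec {
    // Runs an external command and returns its stdout as a string.
    public static String run(String... cmd) throws Exception {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.redirectErrorStream(true); // merge stderr into stdout
        Process p = pb.start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        int code = p.waitFor();
        if (code != 0) {
            throw new RuntimeException("command exited with " + code);
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(run("echo", "hello"));
    }
}
```

If this works locally but not inside a task, the usual suspects are the task's working directory, PATH, and file permissions on the node executing the container, rather than the Java code itself.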

Re: namenode question

2015-05-12 Thread Harsh J
Unless you turn on "dfs.client.use.datanode.hostname", the NN will always use IPs to denote replica location addresses. On Sun, May 10, 2015 at 9:41 PM, Pravin Sinha wrote: > Hi Asanjar, > > My understanding is that it returns serialized BlockLocation instances which > holds the location informat
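The switch Harsh mentions is a client-side HDFS setting; as a sketch, the hdfs-site.xml fragment would be:

```xml
<!-- hdfs-site.xml (client side): make block-location reports use
     datanode hostnames instead of IP addresses. -->
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>
```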

Re: Reading a sequence file from distributed cache

2015-05-12 Thread Marko Dinic
Hello Shahab, I'm using 1.2.1 in pseudo-distributed mode and the same code on a cluster with 0.20.2, but I'm having the same problem in both cases. I'm hoping that 1.2.1 code is backward-compatible with a 0.20.2 cluster? Do you have any idea what could be the problem? And what do you mean by - Have y

Re: Pig 0.14.0 on Hadoop 2.6.0 deprecation errors

2015-05-12 Thread Anand Murali
Oliver: Many thanks for the reply. If it is not an error, why is the info repeated again and again? All defaults have been set in the *.xml files, and if it is picking them up, why mention deprecation? I don't understand. Regards Anand Sent from my iPhone > On 12-May-2015, at 1:10 pm, Olivier Renault wrot

Re: Pig 0.14.0 on Hadoop 2.6.0 deprecation errors

2015-05-12 Thread Olivier Renault
You don't have an error. You are seeing normal info messages. Thanks, Olivier _ From: Anand Murali Subject: Pig 0.14.0 on Hadoop 2.6.0 deprecation errors To: user@hadoop.apache.org Hi All: I have installed above and made corresponding changes to core-site, hdfs-site an

Pig 0.14.0 on Hadoop 2.6.0 deprecation errors

2015-05-12 Thread Anand Murali
Hi All: I have installed above and made corresponding changes to core-site,hdfs-site and mapred-site.xml and still get deprecation error. On system startup I run . .hadoop export HADOOP_HOME=/home/anand_vihar/hadoop-2.6.0 export JAVA_HOME=/home/anand_vihar/jdk1.7.0_75/ export HADOOP\_PREFIX=/home