Does libhdfs c/c++ api support read/write compressed file

2013-06-03 Thread Xu Haiti
I found a discussion from around 2010 saying that libhdfs does not support reading/writing gzip files. I downloaded the newest hadoop-2.0.4 and read hdfs.h; there are still no compression-related arguments. I am wondering whether it supports reading compressed files now. If not, how can I make a patch for libhdfs?

How to get the intermediate mapper output file name

2013-06-03 Thread samir das mohapatra
Hi all, how can I get the mapper output file name inside the mapper, or how can I change the mapper output file name? By default it looks like part-m-00000, part-m-00001, etc. Regards, samir.

Please upgrade the Commons Configuration jar

2013-06-03 Thread Sebastiano Vigna
The 0.23.7 distribution of Hadoop still contains version 1.6 of Commons Configuration. It is a version released on 2008-12-25, almost *five years* ago. Could you at least upgrade it to 1.8 (three years old)? 1.6 is giving us constant headaches as it is missing a method (java.lang.NoSuchMethodError).

Re: How to get the intermediate mapper output file name

2013-06-03 Thread Rahul Bhattacharjee
I think the format of the mapper and reducer output files is hard-wired into the Hadoop code; however, you can prepend something to the beginning of the file name, or even add a directory, using a multiple output format. thanks, Rahul

Re: How to get the intermediate mapper output file name

2013-06-03 Thread Dino Kečo
Hi Samir, file naming is defined in the FileOutputFormat class, and there is a property, mapreduce.output.basename, which you can use to tweak the file naming. Please check this code: http://grepcode.com/file/repository.cloudera.com/content/repositories/releases/com.cloudera.hadoop/hadoop-core/0.20
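
As an illustration of Dino's suggestion, a minimal driver sketch that sets that property, assuming the Hadoop 2.x-style mapreduce API; the class name and input/output paths are hypothetical, and the identity Mapper stands in for real map logic.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class BasenameExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Output files become <basename>-m-00000 etc. instead of part-m-00000.
    conf.set("mapreduce.output.basename", "samir");

    Job job = Job.getInstance(conf, "basename example");
    job.setJarByClass(BasenameExample.class);
    job.setMapperClass(Mapper.class);   // identity mapper, map-only job
    job.setNumReduceTasks(0);
    job.setOutputKeyClass(LongWritable.class);
    job.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}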

Re: How to get the intermediate mapper output file name

2013-06-03 Thread Rahul Bhattacharjee
Thanks Dino, good to know this.

Re: How to get the intermediate mapper output file name

2013-06-03 Thread Raj K Singh
You can use *getInputFileBasedOutputFileName*(JobConf job, String name), which generates the output file name based on a given name and the input file name. thanks Raj K Singh http://www.rajkrrsingh.blogspot.com Mobile Tel: +91 (0)9899821370

Re: Does libhdfs c/c++ api support read/write compressed file

2013-06-03 Thread Harsh J
Hi Xu, HDFS is data agnostic. It does not currently care what form the data in the files is in - whether it is compressed, encrypted, serialized in format-x, etc. There are hadoop-common APIs that support decompression with the supported codecs, but there are no C/C++-level implementations of these codecs.
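
For the Java side Harsh mentions, a minimal sketch of reading a compressed file through the hadoop-common codec APIs might look like the following (the input path is hypothetical; the codec is picked from the file extension, e.g. .gz or .bz2):

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;

public class ReadCompressed {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path path = new Path(args[0]);   // e.g. /data/input.gz (hypothetical)
    FileSystem fs = FileSystem.get(path.toUri(), conf);

    // Pick a codec from the file extension; null means the file is not compressed.
    CompressionCodecFactory factory = new CompressionCodecFactory(conf);
    CompressionCodec codec = factory.getCodec(path);

    InputStream in = (codec == null)
        ? fs.open(path)
        : codec.createInputStream(fs.open(path));

    try (BufferedReader reader = new BufferedReader(new InputStreamReader(in))) {
      String line;
      while ((line = reader.readLine()) != null) {
        System.out.println(line);
      }
    }
  }
}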

Re:

2013-06-03 Thread Azuryy Yu
Can you upgrade to 1.1.2, which is also a stable release and fixes the bug you are facing now? --Sent from my Sony mobile. On Jun 2, 2013 3:23 AM, "Shahab Yunus" wrote: > Thanks Harsh for the reply. I was confused too about why security is causing this. > Regards, > Shahab

Re: Does libhdfs c/c++ api support read/write compressed file

2013-06-03 Thread Michael Segel
Silly question... then what's meant by the native libraries when you talk about compression?

Installing Hue

2013-06-03 Thread Michael Namaiandeh
I am trying to install Hue for Hadoop, but when I run make install, I receive the following error message. Any help would be greatly appreciated. gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m -unwind-t

RE: How to start developing!

2013-06-03 Thread John Lilley
I had asked a similar question recently: First, follow the instructions in BUILDING.txt. It is a bit tedious, but if you are careful and get all of the exact versions of everything installed (don't rely on yum to get the right version), you will have a built Hadoop. Second, there's the question

HDFS interfaces

2013-06-03 Thread Mahmood Naderan
Hello, It is stated in the "HDFS architecture guide" (https://hadoop.apache.org/docs/r1.0.4/hdfs_design.html) that HDFS provides interfaces for applications to move themselves closer to where the data is located. What are these interfaces, and where are they in the source code? Is there any

Re: How can we download a file using WebHDFS REST API

2013-06-03 Thread Mohammad Tariq
Hello Mustaqeem, I don't think this is possible through WebHDFS, as others have said. But you could use wget (courtesy of the DataNode API) to download the file. Something like this: wget http://datanode:50075/streamFile/path_of_the_file Please note that you need to issue this command on the

Exposing hadoop web interface

2013-06-03 Thread jamal sasha
Hi, I have deployed Hadoop on a small cluster. Now the issue is that on job launch it says Tracking URL: http://foobar:50030/jobdetails.jsp?jobid=job_201305241622_0047, but I cannot open this URL to look at the job status (maybe it's the firewall? proxy?). I can only look at my l

Re: Exposing hadoop web interface

2013-06-03 Thread Shahab Yunus
Are you able to access the main JobTracker UI page? http://foobar:50030/jobtracker.jsp If not, then it can be a firewall/proxy issue (or incorrect usage of the foobar domain; maybe you need to use the IP or FQDN). I am assuming that your jobs are completing successfully behind the scenes. Regards, Shahab

RE: built hadoop! please help with next steps?

2013-06-03 Thread John Lilley
I've followed the instructions in BUILDING.txt, generated the Eclipse projects with Maven, and imported them using File -> Import -> General -> Existing Projects into Workspace..., and they all appear. However, the Problems window shows: Project 'hadoop-streaming' is missing

copyToLocal Failed inside the cleanup(.........) of Map task

2013-06-03 Thread samir das mohapatra
Dear All, Is there any way to copy the intermediate output file of the mapper into a local folder after each map task completes? Right now I am using FileSystem.copyToLocalFile(hdfsLocation, localLocation); inside the cleanup of the mapper task, but it is failing. Exception: file no

Re: copyToLocal Failed inside the cleanup(.........) of Map task

2013-06-03 Thread Shahab Yunus
Have you taken a look at extending the FileOutputFormat class and overriding the OutputCommitter API functionality? Regards, Shahab

Re: copyToLocal Failed inside the cleanup(.........) of Map task

2013-06-03 Thread Mischa Tuffield
When you are configuring your job in what most people refer to as their "Driver" class, you can simply not set a Reducer and only set a Mapper. // makes the job a map-only one job.setNumReduceTasks(0); job.setMapperClass(MyFooMapper.class); Mischa

Re: copyToLocal Failed inside the cleanup(.........) of Map task

2013-06-03 Thread samir das mohapatra
Do you have any link or example? Could you please send it to me?

Re: copyToLocal Failed inside the cleanup(.........) of Map task

2013-06-03 Thread Deepak Vohra
Samir, The intermediate output of the mapper is already output to the local filesystem, not HDFS. The temporary intermediate file path is FileOutputFormat.getWorkOutputPath(context).

Re: copyToLocal Failed inside the cleanup(.........) of Map task

2013-06-03 Thread Jay Vyas
Just for context, and so that we can understand the question: why do you need to copy the intermediate mapper output?

Re: built hadoop! please help with next steps?

2013-06-03 Thread Deepak Vohra
John, the following patch is related to the issue cited: https://issues.apache.org/jira/browse/HADOOP-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel thanks, Deepak

Re:

2013-06-03 Thread Lanati, Matteo
Hi Azuryy, thanks for the update. Sorry for the silly question, but where can I download the patched version? If I look into the closest mirror (i.e. http://mirror.netcologne.de/apache.org/hadoop/common/), I can see that the Hadoop 1.1.2 version was last updated on Jan. 31st. Thanks in advance,

Re: copyToLocal Failed inside the cleanup(.........) of Map task

2013-06-03 Thread Shahab Yunus
Check out pages 217-220 of the book Hadoop: The Definitive Guide. It has a nice explanation. Also, http://whiteycode.blogspot.com/2012/06/hadoop-removing-empty-output-files.html Plus, as Jay said, an explanation of your use case would also be helpful. Regards, Shahab

RE: HDFS interfaces

2013-06-03 Thread John Lilley
Mahmood, It is in the FileSystem interface. http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/FileSystem.html#getFileBlockLocations(org.apache.hadoop.fs.Path, long, long)
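
A small sketch of using that call (new-API style; the file path is hypothetical). Since each BlockLocation lists the hosts holding a replica of that block, the same call also answers the "how to locate the replicas of a file" question further down in this digest.

import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocations {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path path = new Path(args[0]);   // e.g. /user/mahmood/mytext.txt (hypothetical)
    FileSystem fs = FileSystem.get(path.toUri(), conf);

    FileStatus status = fs.getFileStatus(path);
    // Ask for the locations of every block in the file.
    BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());

    for (int i = 0; i < blocks.length; i++) {
      System.out.printf("block %d: offset=%d length=%d hosts=%s%n",
          i, blocks[i].getOffset(), blocks[i].getLength(),
          Arrays.toString(blocks[i].getHosts()));
    }
  }
}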

Re: [HADOOP-Securemode]checkpoint and fsck are failing

2013-06-03 Thread Mikhail Antonov
"I got root cause,Here keytab for HTTP and NN should be same..." I have the same error, but running secured Oozie under CDH 4.2.1. Could you please give more details about your fix to the problem you had? I suspect I have something similar. Mikhail

RE: built hadoop! please help with next steps?

2013-06-03 Thread John Lilley
I am getting errors trying to install m2e… has anyone else encountered this? Cannot complete the install because one or more required items could not be found. Software being installed: m2e - Maven Integration for Eclipse 1.4.0.20130601-0314 (org.eclipse.m2e.feature.feature.group 1.4.0.20130601

Re:

2013-06-03 Thread Azuryy Yu
Yes, hadoop-1.1.2 was released on Jan. 31st; just download it.

Re:

2013-06-03 Thread Harsh J
Azuryy, 1.1.2 < 1.2.0. Its not an upgrade you're suggesting there. If you feel there's been a regression, can you comment that on the JIRA? On Tue, Jun 4, 2013 at 6:57 AM, Azuryy Yu wrote: > yes. hadoop-1.1.2 was released on Jan. 31st. just download it. > > > On Tue, Jun 4, 2013 at 6:33 AM, Lana

Re:

2013-06-03 Thread Azuryy Yu
Hi Harsh, I need to take care of my eyes; I misread 1.2.0 as 1.0.2, which is why I said upgrade. Sorry.

Re: How to get the intermediate mapper output file name

2013-06-03 Thread Serega Sheypak
See http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/lib/MultipleOutputFormat.html - Case two: This class is used for a map-only job. The job wants to use an output file name that is either a part of the input file name of the input data, or some derivation of it. -- Case three:
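
A rough sketch of the kind of subclass that javadoc describes, using the old (mapred) API; the class name and the key-based directory layout are just examples, and to simply prepend a prefix (as Rahul suggested earlier) you could return "myprefix-" + name instead.

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat;

// Routes each record into a file whose path is derived from the key:
// key "2013-06-03" with default leaf "part-00000" becomes "2013-06-03/part-00000".
// Keeping the original leaf name avoids collisions between tasks.
public class KeyBasedOutputFormat extends MultipleTextOutputFormat<Text, Text> {
  @Override
  protected String generateFileNameForKeyValue(Text key, Text value, String name) {
    return key.toString() + "/" + name;
  }
}

With the old API you would then register it on the JobConf via conf.setOutputFormat(KeyBasedOutputFormat.class).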

Re: copyToLocal Failed inside the cleanup(.........) of Map task

2013-06-03 Thread Raj K Singh
By default Hadoop keeps the intermediate values produced by the mapper on the local file system; you can get a handle on them using FileOutputFormat.getWorkOutputPath(context). Raj K Singh http://www.rajkrrsingh.blogspot.com Mobile Tel: +91 (0)9899821370
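
Putting Deepak's and Raj's pointers together, a rough sketch of what the cleanup() approach could look like with the new (mapreduce) API; the local destination directory is hypothetical, error handling is omitted, and what the work path contains at cleanup time depends on how the job writes its output.

import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CopyingMapper extends Mapper<LongWritable, Text, Text, Text> {

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    context.write(new Text("k"), value);   // placeholder map logic
  }

  @Override
  protected void cleanup(Context context) throws IOException, InterruptedException {
    // Temporary work/output directory of this task attempt.
    Path workDir = FileOutputFormat.getWorkOutputPath(context);
    FileSystem fs = workDir.getFileSystem(context.getConfiguration());

    // Hypothetical local destination; must be writable on the task node.
    Path localDir = new Path("file:///tmp/map-output-" + context.getTaskAttemptID());
    fs.copyToLocalFile(false, workDir, localDir);
  }
}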

how to locate the replicas of a file in HDFS?

2013-06-03 Thread 一凡 李
Hi, Could you tell me how to locate where each replica of a file is stored in HDFS? More precisely, if I create a file in HDFS (replication factor 3), how can I find the DataNodes which store each of its blocks and replicas? Best Wishes, Yifan

Re: how to locate the replicas of a file in HDFS?

2013-06-03 Thread Rahul Bhattacharjee
hadoop fsck mytext.txt -files -locations -blocks Thanks, Rahul

Re: HDFS interfaces

2013-06-03 Thread Mahmood Naderan
There are many instances of getFileBlockLocations in hadoop/fs. Can you explain which one is the main one? > It must be combined with a method of logically splitting the input data along block boundaries, and of launching tasks on worker nodes that are close to the data splits. Is this a user-level

Re: how to locate the replicas of a file in HDFS?

2013-06-03 Thread Mahmood Naderan
>hadoop fsck mytext.txt -files -locations -blocks I expect something like a tag which is attached to each block (say block X) that shows the position of the replicated block of X. The method you mentioned is a user level task. Am I right?   Regards, Mahmood F

HDFS edit log NPE

2013-06-03 Thread Robert Dyer
I recently upgraded from 1.0.4 to 1.1.2. Now, however, my HDFS won't start up. There appears to be something wrong in the edits file. Obviously I can roll back to a previous checkpoint; however, it appears checkpointing has been failing for some time and my last checkpoint is over a month old. Is

Re: HDFS interfaces

2013-06-03 Thread Jay Vyas
Looking at the source, it appears that in HDFS the NameNode supports getting this info directly via the client, and ultimately communicates block locations to the DFSClient, which is used by the DistributedFileSystem. /** * @see ClientProtocol#getBlockLocations(String, long, long) */

Re: MapReduce on Local FileSystem

2013-06-03 Thread Kun Ling
Hi Agarwal, I once had similar questions and did some experiments. Here is my experience: 1. For some applications over MR, like HBase and Hive, which do not need to submit additional files to HDFS, file:/// can work well without any problem (according to my tests). 2. For simple MR applications
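
A small sketch of pointing Hadoop at the local filesystem instead of HDFS, which is the setting this thread revolves around (Hadoop 1.x property name assumed; in 2.x the key is fs.defaultFS, and the path below is hypothetical):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LocalFsCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Use the local filesystem instead of HDFS as the default store.
    // "fs.default.name" is the Hadoop 1.x key; in 2.x it is "fs.defaultFS".
    conf.set("fs.default.name", "file:///");

    FileSystem fs = FileSystem.get(conf);
    System.out.println("Default FS: " + fs.getUri());            // file:///
    System.out.println("Working dir: " + fs.getWorkingDirectory());

    // Jobs submitted with this configuration read and write file:// paths,
    // e.g. new Path("file:///data/input").
    Path p = new Path("file:///tmp/localfs-check");               // hypothetical path
    fs.mkdirs(p);
    System.out.println("Created " + p + ": " + fs.exists(p));
  }
}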