Hi Ajay,
Have you tried SequenceFileAsBinaryInputFormat?
Thanks
Rekha
On 11/09/12 11:24 AM, Ajay Srivastava ajay.srivast...@guavus.com wrote:
Hi,
I am using the default InputFormat class for reading input from text files,
but the input file has some non-UTF-8 characters.
I guess that the TextInputFormat class
Hi,
What happens when an existing (not new) datanode rejoins a cluster in the
following scenarios:
1. Some of the blocks it was managing are deleted/modified?
2. The size of the blocks is now modified, say from 64MB to 128MB?
3. What if the block replication factor was one
Hi Mehul
Some of the blocks it was managing are deleted/modified?
The namenode will asynchronously replicate the blocks to other datanodes
in order to maintain the replication factor after a datanode has not
been in contact for 10 minutes.
The size of the blocks are now modified say
Mehul,
Let me make an addition.
Some of the blocks it was managing are deleted/modified?
Blocks that are deleted in the interim will be deleted on the rejoining
node as well, after it rejoins. Regarding modification, I'd advise
against modifying blocks after they have been fully written.
George has answered most of these. I'll just add on:
On Tue, Sep 11, 2012 at 12:44 PM, Mehul Choube
mehul_cho...@symantec.com wrote:
1. Some of the blocks it was managing are deleted/modified?
A DN runs a block report upon start, and sends the list of blocks to
the NN. NN validates them
Hi Jason,
What Mehmet said is exactly correct: without reducers we cannot
increase performance. You can add mappers and reducers to any
data-processing job, and you will get the output with good performance.
Thanks Regards,
Ramesh.Narasingu
On Tue, Sep 11, 2012 at 9:31 AM, Mehmet
Rekha,
I guess the problem is that the Text class uses UTF-8 encoding, and one cannot set
a different encoding for this class.
I have not seen any other Text-like class that supports other encodings;
failing that, I will have to write my own custom input format class.
Thanks for your inputs.
Regards,
Ajay
Hi Harsh,
Thanks for your reply. And I am sorry for my unclear description.
As I mentioned previously, I think I configured the FairScheduler correctly in
hadoop-0.22.0.
But when I submit lots of jobs:
many big jobs (whose map and reduce task counts are bigger than the
map/reduce slots) are submitted
Hi,
What happens when an existing (not new) datanode rejoins a cluster in the
following scenarios:
a) Some of the blocks it was managing are deleted/modified?
b) The size of the blocks is now modified, say from 64MB to 128MB?
c) What if the block replication factor was one (yea not in most
Changing the hostname to lowercase fixed this particular problem - thanks for
your replies.
The build is failing elsewhere now, I'll post a new thread for that.
Tony
From: Tony Burton [mailto:tbur...@sportingindex.com]
Sent: 10 September 2012 10:44
To: user@hadoop.apache.org
Subject: RE:
Hi Vinod,
Please check that the input file location and output file
location do not match. First put your input file into HDFS and
then run the MR job; it should work fine.
Thanks Regards,
Ramesh.Narasingu
On Tue, Sep 11, 2012 at 4:23 AM, Vinod Kumar Vavilapalli
Hi,
I think there is a single command that builds all the
applications.
Thanks Regards,
Ramesh.Narasingu
On Tue, Sep 11, 2012 at 2:28 PM, Tony Burton tbur...@sportingindex.comwrote:
Hi,
I’ve checked out the hadoop trunk, and I’m running “mvn test” on the
The namenode will asynchronously replicate the blocks to other datanodes in
order to maintain the replication factor after a datanode has not been in
contact for 10 minutes.
What happens when the datanode rejoins after the namenode has already re-replicated
the blocks it was managing?
Will
Hi,
Inline.
On Tue, Sep 11, 2012 at 2:36 PM, Mehul Choube mehul_cho...@symantec.com wrote:
The namenode will asynchronously replicate the blocks to other datanodes
in order to maintain the replication factor after a datanode has not been in
contact for 10 minutes.
What happens when the
DataNode rejoins take care of only NameNode.
Sorry didn't get this
From: Narasingu Ramesh [mailto:ramesh.narasi...@gmail.com]
Sent: Tuesday, September 11, 2012 2:38 PM
To: user@hadoop.apache.org
Subject: Re: what happens when a datanode rejoins?
Hi Mehul,
DataNode rejoins take care
Ha, good sleuthing.
I just moved it to INFRA, as no one from our side has gotten to this
yet. I guess we can only moderate, not administrate. So the ticket now
awaits action from INFRA on ejecting it out.
On Tue, Sep 11, 2012 at 2:34 PM, Tony Burton tbur...@sportingindex.com wrote:
Thanks
Hi Ramesh
Thanks for the quick reply, but I'm having trouble following your English. Are
you saying that there is one command to build everything? If so, can you tell
me what it is?
Tony
From: Narasingu Ramesh [mailto:ramesh.narasi...@gmail.com]
Sent: 11 September 2012 10:06
To:
And done. We shouldn't get this anymore. Thanks for bumping on this issue Tony!
On Tue, Sep 11, 2012 at 2:44 PM, Harsh J ha...@cloudera.com wrote:
Ha, good sleuthing.
I just moved it to INFRA, as no one from our side has gotten to this
yet. I guess we can only moderate, not administrate. So
It's probably some Maven thing - in particular Maven's habit of grabbing the
online nightly snapshots off Apache rather than local.
Try mvn clean install -DskipTests -offline
to force in all the artifacts, then run the MR tests.
Tony - why not get on the mapreduce-dev mailing list, as this is the
Thanks Steve, I’ll try the mvn command you suggested. All the snapshots I can see
came from repository.apache.org though.
How do I run the MR tests only?
Thanks for the mapreduce-dev mailing list suggestion, I thought all lists had
merged into one though – did I get the wrong end of the stick?
No problem! I'll remove that Outlook filter now... :)
-Original Message-
From: Harsh J [mailto:ha...@cloudera.com]
Sent: 11 September 2012 10:34
To: user@hadoop.apache.org
Subject: Re: Undeliverable messages
And done. We shouldn't get this anymore. Thanks for bumping on this issue Tony!
Good suggestions Harsh and Hemanth.
When I was asked to submit a patch for hadoop 1.0.3, I thought it a good
exercise to work through the build process to become familiar even though the
patch is documentation-only. Maybe the requests for patches could come with a
list of suggested reading as
Hi Lin
The default value for the number of reducers is 1:
<name>mapred.reduce.tasks</name>
<value>1</value>
It is not determined by data volume. You need to specify the number of
reducers for your mapreduce jobs as per your data volume.
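For example, a per-job setting would look something like this (a sketch using the classic MapReduce property name; the value here is just an illustration, set it as per your data volume):

<property>
  <name>mapred.reduce.tasks</name>
  <value>10</value>
</property>

You can also pass it on the command line as -Dmapred.reduce.tasks=10 if your job uses ToolRunner.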
Regards
Bejoy KS
On Tue, Sep 11, 2012 at 4:53 PM, Jason Yang
Dear Madam,
I am attaching relevant screenshots of running Hive.
Profile.png shows the contents of /etc/profile.
hive_common_lib.png shows hive_common*.jar is already in $HIVE_HOME/lib; here
$HIVE_HOME is /home/yahoo/hive/build/dist, as evident from classpath_err.png.
Yours Truly
G
Hi Lin
The default values for all the properties are in
core-default.xml
hdfs-default.xml and
mapred-default.xml
Regards
Bejoy KS
On Tue, Sep 11, 2012 at 5:06 PM, Jason Yang lin.yang.ja...@gmail.comwrote:
Hi, Bejoy
Thanks for your reply.
Where could I find the default value of
Hi Yaron
Sqoop uses a similar implementation. You can get some details there.
Replies inline
• (more general question) Are there many use-cases for using DBInputFormat? Do
most Hadoop jobs take their input from files or DBs?
From my small experience, most MR jobs have their data in HDFS. It is
Hi,
I want to make sure whether my understanding of task assignment in hadoop
is correct.
When scanning a file with multiple tasktrackers,
I am wondering how a task is assigned to each tasktracker.
Is it based on the block sequence or on data locality?
Let me explain my question by example.
Hi,
Task assignment takes data locality into account first and not block
sequence. In hadoop, tasktrackers ask the jobtracker to be assigned tasks.
When such a request comes to the jobtracker, it will try to look for an
unassigned task which needs data that is close to the tasktracker and will
Another mvn test caused the build to fail slightly further down the road. As
my Jira issue is documentation-only, I've submitted the patch anyway.
Is this multiple-failure scenario typical for trying to build hadoop from the
trunk? It's sure putting me off submitting code in future. Is there
Could you please review your configuration to see if you are pointing to
the right namenode address ? (This will be in core-site.xml)
Please paste it here so we can look for clues.
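For reference, a minimal core-site.xml pointing at the namenode looks something like this (the hostname and port here are placeholders, substitute your own):

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode-host:9000</value>
  </property>
</configuration>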
Thanks
hemanth
On Tue, Sep 11, 2012 at 9:25 PM, yogesh dhari yogeshdh...@live.com wrote:
Hi all,
I am running
Yogesh
try this
hadoop fsck -Ddfs.http.address=localhost:50070 /
50070 is the default http port that the namenode runs on. The property
dfs.http.address should be set in your hdfs-site.xml
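For example, in hdfs-site.xml (the value shown is the default; adjust host and port to your setup):

<property>
  <name>dfs.http.address</name>
  <value>0.0.0.0:50070</value>
</property>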
--
Arpit Gupta
Hortonworks Inc.
http://hortonworks.com/
On Sep 11, 2012, at 9:03 AM, yogesh dhari
Hi, All
I need to setup a hadoop/hdfs cluster with one namenode on a machine and
two datanodes on two other machines. But after setting the datanode machines
in the conf/slaves file, running bin/start-dfs.sh cannot start HDFS normally.
I am aware that I have not specified the root directory hadoop is
How is security maintained in Hadoop? Is it maintained by giving
folder/file permissions in Hadoop?
How can I make sure that somebody else doesn't write into my HDFS file system?
...
Hello all,
I am not finding a clear way to remove a datanode from the cluster.
Please explain the decommissioning steps to me with an example,
e.g. how to create the exclude file and the other steps involved.
Thanks regards
Yogesh Kumar
By reading the documentation, like the following
http://hadoop.apache.org/docs/r1.0.3/hdfs_permissions_guide.html
On Tue, Sep 11, 2012 at 8:14 PM, nisha nishakulkarn...@gmail.com wrote:
How is security maintained in Hadoop? Is it maintained by giving
folder/file permissions in Hadoop?
How can
Hi Yogesh
The detailed steps are available in hadoop wiki on FAQ page
http://wiki.apache.org/hadoop/FAQ#I_want_to_make_a_large_cluster_smaller_by_taking_out_a_bunch_of_nodes_simultaneously._How_can_this_be_done.3F
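In short (a sketch only; the wiki page has the full steps): point the namenode at an exclude file in hdfs-site.xml, list the hostnames of the nodes to retire in that file, then refresh:

<property>
  <name>dfs.hosts.exclude</name>
  <value>/path/to/excludes</value>
</property>

After adding the hostnames to the exclude file, run hadoop dfsadmin -refreshNodes and wait until the nodes are reported as decommissioned before shutting them down.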
Regards
Bejoy KS
On Wed, Sep 12, 2012 at 12:14 AM, yogesh dhari
I figured out the cause.
HDFS block size is 128MB, but
I specified mapred.min.split.size as 512MB,
and data-local I/O processing goes wrong for some reason.
When I remove the mapred.min.split.size configuration,
tasktrackers pick data-local tasks.
Why does this happen?
It seems like a bug.
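For reference, the setting in question would look like this (value in bytes, from the description above):

<property>
  <name>mapred.min.split.size</name>
  <value>536870912</value>
</property>

With a minimum split size (512MB) larger than the block size (128MB), each split spans several blocks, so a split can no longer be entirely local to one node - which may be why the data-local tasks disappear.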
Split is
Here's one...
Write a Java program which can be accessed on the server side to pull the
picture from HDFS and display it on your JSP.
On Sep 11, 2012, at 3:48 PM, Visioner Sadak visioner.sa...@gmail.com wrote:
Any hints, experts? At least tell me if I'm on the right track, or if we can't use HFTP at
all.
Hi,
I have a sequence file written by SequenceFileOutputFormat with key/value
type of Text, BytesWritable, like below:
Text      BytesWritable
-------   ------------------------
id_A_01   7F2B3C687F2B3C687F2B3C68
id_A_02
Hi,
You have to specify the reducer output key type as NullWritable.
Cheers!
Manoj.
On Wed, Sep 12, 2012 at 7:43 AM, Nataraj Rashmi - rnatar
rashmi.nata...@acxiom.com wrote:
Hello,
I have a simple map/reduce program to merge input files into one big output
file. My question is,
Thanks Bejoy,
I'll try to implement it, and if I face any issues I'll let you know.
Thanks
Stuti
From: Bejoy Ks [mailto:bejoy.had...@gmail.com]
Sent: Tuesday, September 11, 2012 8:39 PM
To: user@hadoop.apache.org
Subject: Re: Issue in access static object in MapReduce
Hi Stuti
You can pass the json
Hi Yogesh,
FYI, please go through the following:
http://tech.zhenhua.info/2011/04/how-to-decommission-nodesblacklist.html
http://hadoop-karma.blogspot.in/2011/01/hadoop-cookbook-how-to-decommission.html
From: yogesh dhari [yogeshdh...@live.com]
Sent: Wednesday,
Have you looked at Terracotta or any other distributed caching system?
Kunal
-- Sent while mobile --
On Sep 11, 2012, at 9:30 PM, Stuti Awasthi stutiawas...@hcl.com wrote:
Thanks Bejoy,
I'll try to implement it, and if I face any issues I'll let you know.
Thanks
Stuti
From: Bejoy Ks
If the file is pre-sorted, why not just make multiple sequence files -
1 for each split?
Then you don't have to compute InputSplits because the physical files
are already split.
On Tue, Sep 11, 2012 at 11:00 PM, Harsh J ha...@cloudera.com wrote:
Hey Jason,
Is the file pre-sorted? You could
Hi,
I tried a similar experiment as yours but couldn't replicate the issue.
I generated 64 MB files and added them to my DFS - one file from every
machine, with a replication factor of 1, like you did. My block size was
64MB. I verified the blocks were located on the same machine as where I