> `hdfs fsck -files -blocks -locations` and the largest block is of
> length `1342177728`.
>
> - Is there some overhead for RPC calls? Could a block of length
> `1342177728` be resulting in the original warning log at the top of this
> post?
>
> - My understanding is that the only way a client writing to HDFS can
> specify a block size is via either `-Ddfs.blocksize
Hi, I am trying to use a larger block size, like 1GB or 2GB. In our case the
files are 5GB to 12GB in size and we process the whole file per mapper. Are
there any side effects to using a larger block size like 1GB or 2GB, for
example HDFS stability when doing replication?
Thanks
Sudhir
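Just in case it helps while others answer the stability question: setting a
per-file block size when the file is written is straightforward through the
FileSystem API. A minimal sketch, where the 1GB value, path, and class name
are only placeholders:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class WriteWithBigBlocks {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        long blockSize = 1024L * 1024 * 1024;  // 1 GB; must be a multiple of dfs.bytes-per-checksum
        short replication = 3;
        int bufferSize = 4096;
        // create(path, overwrite, bufferSize, replication, blockSize)
        try (FSDataOutputStream out = fs.create(
            new Path("/data/bigfile.bin"), true, bufferSize, replication, blockSize)) {
          out.write("payload".getBytes("UTF-8"));
        }
      }
    }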
/hdfs-default.xml
>
> Regards
> Surendra
>
> *From:* Sidharth Kumar [mailto:sidharthkumar2...@gmail.com]
> *Sent:* 22 May 2017 19:36
> *To:* common-u...@hadoop.apache.org
> *Subject:* Hdfs default block size
>
> Hi,
Hi,
Can you kindly tell me what is the default block size in apache hadoop 2.7.3?
Is it 64mb or 128mb?
Thanks
Sidharth
There is also some discussion on that JIRA considering a checksum strategy
independent of block size. I don't think anything was ever implemented
though, and there would be some drawbacks to that approach. Sorry if this
caused confusion.
--Chris Nauroth
On 5/24/16, 9:55 AM, "D
Hello Dmitry,
To clarify, the intent of MAPREDUCE-5065 was to message the user that
using different block sizes on source and destination might cause a
failure due to checksum mismatch. The message to the user recommends either
the -pb (preserve block size) or -skipCrc (skip checksum validation) as
potential workarounds. The intent of that patch was not to silently
proceed and report success when the block sizes are different, although
there was some discussion
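(For anyone landing on this thread later: the two workarounds Chris mentions
would typically be invoked as hadoop distcp -pb hdfs://src/path hdfs://dst/path
to preserve the source block size, or hadoop distcp -update -skipcrccheck
hdfs://src/path hdfs://dst/path to skip the CRC comparison. The exact flag
spellings can differ slightly between releases, so check the distcp usage
output for your version.)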
>> MAPREDUCE-5065 has been included in these branches for a long time. Are
>> you certain that you passed a dfs.blocksize equal to what was used in the
>> source files? Did all source files use the same block size?
>
No, I am sure that I use -D dfs.blocksize=DifferentThanSourceBlockSize (I
want to change it during the copy).
I am not sure that all source files use the same block size (there are
thousands of them), but it is probably wrong to report an error when
Hello Dmitry,
MAPREDUCE-5065 has been included in these branches for a long time. Are
you certain that you passed a dfs.blocksize equal to what was used in the
source files? Did all source files use the same block size?
--Chris Nauroth
On 5/20/16, 3:30 PM, "Dmitry Sivachenko"
Hello,
When I copy files with distcp and -D dfs.blocksize=XXX (hadoop-2.7.2), it fails
with
"Source and target differ in block-size" error despite MAPREDUCE-5065 was
committed 3 years ago.
Is it possible to merge this change to 2.7 / 2.8 branche
Hi,
I am running DistCp programmatically from one Hadoop cluster to another -
using Hadoop 2.7 and distcp v2. I would like to set a custom block size and
replication factor for my files. How can I achieve that ?
Thanks !
Varun
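Not aware of a dedicated option, but since the copy mappers create the target
files with the client-side HDFS settings, one approach is to override those
settings in the Configuration handed to DistCp. A rough sketch against the
Hadoop 2.7-era API (class and method names from memory, so please verify them
against your version; hostnames and sizes are placeholders):

    import java.util.Collections;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.tools.DistCp;
    import org.apache.hadoop.tools.DistCpOptions;

    public class ProgrammaticDistCp {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The copy mappers create destination files with the client-side settings,
        // so overriding them here changes block size / replication on the target
        // (unless block sizes are preserved, see the commented option below).
        conf.set("dfs.blocksize", String.valueOf(256L * 1024 * 1024));
        conf.set("dfs.replication", "2");

        DistCpOptions options = new DistCpOptions(
            Collections.singletonList(new Path("hdfs://source-nn:8020/data")),
            new Path("hdfs://target-nn:8020/data"));
        // options.preserve(DistCpOptions.FileAttribute.BLOCKSIZE); // would keep source block sizes instead

        DistCp distCp = new DistCp(conf, options);
        distCp.execute();  // submits the copy job and waits for it to finish
      }
    }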
Not sure if this feature is available. A workaround would be to update the
replication factor and block size at the HDFS level and revert the
changes after the distcp is complete. This is good for a one time copy. :-)
On Wed, Aug 19, 2015 at 12:52 PM, Ted Yu yuzhih...@gmail.com wrote:
I looked
Best regards,
Marko
On Wed 13 May 2015 06:17:58 AM CEST, Harshit Mathur wrote:
Hi Marko,
If your files are very small (less than the block size) then a lot of
map tasks will get executed, but as the initialization and overheads
degrade the overall performance, it might appear that the single
map
Thank you for the explanation. How many bytes will each piece of metadata
consume in RAM if the block size is 64MB or smaller? I heard every piece of
metadata is stored in RAM, right?
Hello,
I'm in doubt whether I should specify a block size smaller than 64MB, given
that my mappers need to do intensive computations.
I know that it is better to have larger files, because of replication and
the NameNode being a weak point, but I don't have that much data, and the
operations
Hi
I think the metadata size is not greatly different. The problem is the number
of blocks: if the block size is less than 64MB, more blocks are generated for
the same file size (with 32MB, 2x more blocks).
And yes, all metadata is in the namenode's heap memory.
Thanks.
Drake 민영근 Ph.D
kt NexR
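As a rough worked example on top of Drake's point (using the commonly quoted
order-of-magnitude figure of roughly 150 bytes of NameNode heap per
file/directory/block object, so treat the absolute numbers loosely): 1 TB of
data stored as 128 MB blocks is about 8,192 blocks, while the same data at
32 MB blocks is about 32,768 blocks, i.e. 4x the block objects and roughly 4x
the heap for them (~1.2 MB vs ~4.9 MB for that one dataset). The per-block
cost is tiny; it is the multiplication across millions of small files and
blocks that drives the NameNode heap up.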
Hi Marko,
If your files are very small (less than the block size) then a lot of map
tasks will get executed, but as the initialization and overheads degrade
the overall performance, it might appear that a single map is
executing very fast but the overall job execution will take more time
The default HDFS block size of 64 MB means it is the maximum size of a block
of data written to HDFS. So, if you write 4 MB files, they will still be
occupying only 1 block of 4 MB size, not more than that. If your file is
more than 64MB, it gets split into multiple blocks.
If you set the HDFS block
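If you want to see this on a real file, a small sketch that prints the block
size recorded for a file next to its actual length (the path is passed as the
first argument):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ShowBlockInfo {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FileStatus st = fs.getFileStatus(new Path(args[0]));
        // getBlockSize() is the block size the file was created with (e.g. 64 MB);
        // getLen() is the real number of bytes, which is all a 4 MB file occupies.
        System.out.println("block size = " + st.getBlockSize()
            + ", length = " + st.getLen()
            + ", blocks = " + fs.getFileBlockLocations(st, 0, st.getLen()).length);
      }
    }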
Hi guys, I have a couple of questions about HDFS block size:
What will happen if I set my HDFS block size from the default 64 MB to 2 MB
per block?
I want to decrease the block size because I want to store image files
(jpeg, png etc.) of about 4MB each. What is your
Hi,
The block size for HDFS is currently set to 128MB by default. This is
configurable.
My point is that I assume this parameter in hadoop-core.xml sets the
block size for both namenode and datanode. However, the storage and
random access for metadata in namenode is different and suits
Hi Mich,
please see the comments in your text.
2015-03-25 15:11 GMT+00:00 Dr Mich Talebzadeh m...@peridale.co.uk:
Hi,
The block size for HDFS is currently set to 128MB by default. This is
configurable.
Correct, an HDFS client can overwrite the cfg-property and define a
different block
?
Regards,
Mich
Let your email find you with BlackBerry from Vodafone
-Original Message-
From: Mirko Kämpf mirko.kae...@gmail.com
Date: Wed, 25 Mar 2015 15:20:03
To: user@hadoop.apache.org
Reply-To: user@hadoop.apache.org
Subject: Re: can block size for namenode
Date: Wed, 25 Mar 2015 16:08:02
To: user@hadoop.apache.org; m...@peridale.co.uk
Reply-To: user@hadoop.apache.org
Subject: Re: can block size for namenode be different from datanode block size?
Correct, let's say you run the NameNode with just 1GB of RAM.
This would
Hi Mich!
The block size you are referring to is used only on the datanodes. The file
that the namenode writes (fsimage OR editlog) is not chunked using this block
size.
HTH
Ravi
On Wednesday, March 25, 2015 8:12 AM, Dr Mich Talebzadeh
m...@peridale.co.uk wrote:
Hi,
The block
2. The block size is only relevant to DataNodes (DN). NameNode (NN)
does not use this parameter
Actually, as a configuration, it's only relevant to the client. See also
http://www.quora.com/How-do-I-check-HDFS-blocksize-default-custom
Other points sound about right, except the ability to do
Thank you all for your contribution.
I have summarised the findings as below
1. The Hadoop block size is a configurable parameter dfs.block.size in
bytes. By default this is set to 134217728 bytes or 128MB
2. The block size is only relevant to DataNodes (DN). NameNode (NN) does
Hi,
I have read somewhere that default block size in Hadoop 2.4 is 256MB .
Is it correct ?
In which version default block size was 128MB ?
Thanks
Krish
As of Hadoop 2.6, default blocksize is 128 MB (look for dfs.blocksize)
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
Cheers
On Sun, Feb 22, 2015 at 11:11 AM, Krish Donald gotomyp...@gmail.com wrote:
Hi,
I have read somewhere that default block size
Hello All,
If the HDFS block size is set to 128MB on the cluster and on the client it is
set to 64MB, what will be the size of the block when it is written to HDFS?
Can anyone please point me to a link where I can find more information.
Thanks
Sajeeth
is not specific to this
parameter – for example, the same thing happens with *dfs.replication* and
others….
Regards,
Shahab
On Wed, Dec 10, 2014 at 3:27 PM, Sajid Syed sajid...@gmail.com wrote:
Hello All,
If the HDFS block size is to 128MB on the cluster and on the Client its
set to 64MB
Hi,
although the cluster has 128MB the client always goes with the configuration
local to it. So in this case it will use the 64MB.
Date: Wed, 10 Dec 2014 15:27:07 -0500
Subject: HDFS block size question
From: sajid...@gmail.com
To: user@hadoop.apache.org
Hello All,
If the HDFS block
Your client side was running at 14/07/24 18:35:58 INFO mapreduce.Job:
T***, but you are pasting the NN log at 2014-07-24 17:39:34,255.
By the way, which version of HDFS are you using?
Regards,
*Stanley Shi,*
On Fri, Jul 25, 2014 at 10:36 AM, ch huang justlo...@gmail.com wrote:
2014-07-24
hi, maillist:
I try to copy data from my old cluster to a new cluster and I get an
error. How do I handle this?
14/07/24 18:35:58 INFO mapreduce.Job: Task Id :
attempt_1406182801379_0004_m_00_1, Status : FAILED
Error: java.io.IOException: File copy failed:
Would you please also paste the corresponding namenode log?
Regards,
*Stanley Shi,*
On Fri, Jul 25, 2014 at 9:15 AM, ch huang justlo...@gmail.com wrote:
hi,maillist:
i try to copy data from my old cluster to new cluster,i get
error ,how to handle this?
14/07/24 18:35:58 INFO
2014-07-24 17:33:04,783 WARN
org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
as:hdfs (auth:SIMPLE) cause:org.apache.hadoop.ipc.StandbyException:
Operation category READ is not supported in state standby
2014-07-24 17:33:05,742 WARN
You can do it programmatically as well.
http://stackoverflow.com/questions/2669800/changing-the-block-size-of-a-dfs
Change the dfs.block.size in hdfs-site.xml to the value you would like
if you want all new files to have a different block size.
On Fri, Jan 3, 2014 at 11:37 AM, Kurt Moesky kurtmoe...@gmail.com wrote:
I see the default block size for HDFS is 64 MB, is this a value that can
be changed
Also note that the block size in recent releases is actually called
dfs.blocksize as opposed to dfs.block.size, and that you can set it per
job as well. In that scenario, just pass it as an argument to your job (e.g.
Hadoop bla -D dfs.blocksize=134217728)
Regards
From: David Sinclair
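For completeness, the per-job variant from code would look roughly like the
sketch below (new MapReduce API assumed; on older releases the property is
dfs.block.size instead of dfs.blocksize):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class BlockSizePerJob {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Files written by this job's tasks should be created with 128 MB blocks.
        conf.setLong("dfs.blocksize", 134217728L);
        Job job = Job.getInstance(conf, "job with 128MB output blocks");
        // ... set mapper/reducer, input/output paths as usual, then:
        // job.waitForCompletion(true);
      }
    }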
As I am new to hdfs, I was told that the minimum block size is 64M, is it
correct?
XG
On Jan 4, 2014, at 3:12, German Florez-Larrahondo
german...@samsung.com wrote:
Also note that the block size in recent releases is actually called
“dfs.blocksize” as opposed
You can change the block size of existing files with a command like
hadoop distcp -Ddfs.block.size=$[256*1024*1024] /path/to/inputdata
/path/to/inputdata-with-largeblocks.
After this command completes, you can remove the original data
From: kun yan
to decide based on
your usecase.
Regards,
Vinayakumar B
On Sep 10, 2013 9:02 AM, kun yan yankunhad...@gmail.com wrote:
Hi all
Can I modify HDFS data block size is 32MB, I know the default is 64MB
thanks
--
In the Hadoop world, I am just a novice, explore the entire Hadoop
ecosystem, I hope one
Hi all
Can I modify the HDFS data block size to 32MB? I know the default is 64MB
thanks
--
In the Hadoop world, I am just a novice exploring the entire Hadoop
ecosystem; I hope one day I can contribute my own code
YanBit
yankunhad...@gmail.com
From: Rahul Bhattacharjee [mailto:rahul.rec@gmail.com]
Subject: Why big block size for HDFS.
Many places it has been written that to avoid a huge number of disk seeks, we
store big blocks in HDFS, so that once we seek to the location, then there is
only the data transfer rate which would be predominant. I hope I understood
this correctly.
My question is, no matter what block size we decide, it finally gets
written to the computer's HDD, which would be formatted and would have a
block size in KBs, and also while writing to the FS (not HDFS) it is not
guaranteed that the blocks that we write are contiguous, so there would be
disk seeks anyway.
Guys,
I understand that if not specified, the default block size of HDFS is 64Mb. You
can control this value by altering the dfs.block.size property and increasing
the value to 64Mb x 2 or 64Mb x 4. Every time we make a change to this property
we must reimport the data for the changes to take effect
Hi,
Response inline.
On Tue, Nov 27, 2012 at 8:35 PM, Kartashov, Andy andy.kartas...@mpac.ca wrote:
Guys,
I understand that if not specified, default block size of HDFs is 64Mb. You
can control this value by altering dfs.block.size property and increasing to
value to 64Mb x 2 or 64Mb x 4
Thanks Harsh. I totally forgot about the locality thing.
I take it, for the best performance it is better to leave the split size property
alone and let the framework handle the splits on the basis of the block size.
p.s. There were meant to be only 5 questions.
Rgds,
AK47
-Original Message
Guys,
After changing the block size property from 64 to 128Mb, will I need to
re-import data, or will running the hadoop balancer resize the blocks in hdfs?
Thanks,
AK
Cheers!
From: Kai Voigt [mailto:k...@123.org]
Sent: Tuesday, November 20, 2012 11:34 AM
To: user@hadoop.apache.org
Subject: Re: block size
Hi,
On 20.11.2012 at 17:31, Kartashov, Andy
andy.kartas...@mpac.ca wrote:
After changing property of block size from 64
Hi,
I apologize for asking a question that has probably been discussed many times
before, but I just want to be sure I understand it correctly. My question is
regarding the advantages of large block size in HDFS.
The Hadoop Definitive Guide provides a comparison with regular file systems.
As I understand it, the data node stores data on a regular file system. If this
is so, then how does having a bigger HDFS block size provide better seek
performance, when the data will ultimately be read from a regular file system
which has a much smaller block size?
Suppose that HDFS stored data in smaller blocks (64kb for example
For HDFS, this is variable in size
since blocks can be smaller than the max size. The key problem with a
large size here is that it is relatively difficult to allow quick reading
of the file during writing. With a smaller block size, the block can be
committed in a way that the reader can read
Raj - I was not able to get this to work either.
On Tue, Oct 2, 2012 at 10:52 AM, Raj Vishwanathan rajv...@yahoo.com wrote:
I haven't tried it but this should also work
hadoop fs -Ddfs.block.size=NEW BLOCK SIZE -cp src dest
Raj
--
*From:* Anna Lahoud
Anna
I misunderstood your problem. I thought you wanted to change the block size of
every file. I didn't realize that you were aggregating multiple small files
into a different, albeit smaller, set of larger files with a bigger block size
to improve performance.
I think as Chris suggested you
Thank you. I will try today.
On Tue, Oct 2, 2012 at 12:23 AM, Bejoy KS bejoy.had...@gmail.com wrote:
Hi Anna
If you want to increase the block size of existing files, you can use an
Identity Mapper with no reducer. Set the min and max split sizes to your
requirement (512Mb). Use
I haven't tried it but this should also work
hadoop fs -Ddfs.block.size=NEW BLOCK SIZE -cp src dest
Raj
From: Anna Lahoud annalah...@gmail.com
To: user@hadoop.apache.org; bejoy.had...@gmail.com
Sent: Tuesday, October 2, 2012 7:17 AM
Subject: Re: File
and IdentityReducer.
Although that approaches a better solution, it still requires that I know
in advance how many reducers I need to get better file sizes.
I was looking at the SequenceFile.Writer constructors and noticed that
there are block size parameters that can be used. Using a writer
constructed
Hello Anna,
If I understand correctly, you have a set of multiple sequence files, each
much smaller than the desired block size, and you want to concatenate them
into a set of fewer files, each one more closely aligned to your desired
block size. Presumably, the goal is to improve throughput
Hi Anna
If you want to increase the block size of existing files, you can use an
Identity Mapper with no reducer. Set the min and max split sizes to your
requirement (512Mb). Use SequenceFileInputFormat and SequenceFileOutputFormat
for your job.
Your job should be done.
Regards
Bejoy KS
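A rough driver for what Bejoy describes might look like the sketch below
(new-API classes; the 512Mb figure and the Text key/value types are
placeholders to adapt to your data, and the default Mapper is already an
identity mapper, so none needs to be set):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;  // adjust to the key/value types in your sequence files
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

    public class RewriteSequenceFiles {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.setLong("dfs.blocksize", 512L * 1024 * 1024);  // write output with 512 MB blocks

        Job job = Job.getInstance(conf, "rewrite sequence files");
        job.setJarByClass(RewriteSequenceFiles.class);
        job.setNumReduceTasks(0);  // map-only; the default Mapper passes records through unchanged
        job.setInputFormatClass(SequenceFileInputFormat.class);
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.setMinInputSplitSize(job, 512L * 1024 * 1024);
        FileInputFormat.setMaxInputSplitSize(job, 512L * 1024 * 1024);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }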
Hi,
We have a situation where all files that we have are 64 MB block size.
I want to change these files (output of a map job mainly) to 128 MB blocks.
What would be a good way to do this migration from 64 mb to 128 mb block
files ?
Thanks,
Anurag Tangri
Hi Anurag,
The easiest option would be, in your map reduce job, to set dfs.block.size to
128 mb.
--Original Message--
From: Anurag Tangri
To: hdfs-user@hadoop.apache.org
To: common-user@hadoop.apache.org
ReplyTo: common-user@hadoop.apache.org
Subject: change hdfs block size for file existing on HDFS
Sent: Jun 26, 2012 11:07
Hi,
We
Thanks very much for the clarification.
So, we'd I guess ideally set the block size equal to the transfer rate for
optimum results.
If seek time has to be 0.5% of transfer time, would I set my block size at
200MB (higher than the transfer rate)?
Conversely, if seek time has to be 2% of transfer time
Have just started getting familiar with Hadoop HDFS. Reading Tom White's
book.
The book describes an example related to HDFS block size. Here's a verbatim
excerpt from the book
If the seek time is around 10 ms, and the transfer rate is 100 MB/s, then
to make the seek time 1% of the transfer
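Working that excerpt through: if a seek costs about 10 ms and it should
account for only 1% of the time spent on a block, the transfer of one block
should take roughly 10 ms / 0.01 = 1 s, and at 100 MB/s that is about 100 MB
per block, which is why the book arrives at block sizes in the 100-128 MB
range. The same arithmetic gives about 200 MB for a 0.5% target and about
50 MB for 2%, consistent with the follow-up above.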
I'm new to Hadoop, and I'm trying to understand the implications of a 64M
block size in the HDFS. Is there a good reference that enumerates the
implications of this decision and its effects on files stored in the system
as well as map-reduce jobs?
Thanks.
hi,
Here is some useful info:
A small file is one which is significantly smaller than the HDFS block size
(default 64MB). If you’re storing small files, then you probably have lots of
them (otherwise you wouldn’t turn to Hadoop), and the problem is that HDFS
can’t handle lots of files.
Every
Hi All:
I have lots of small files stored in HDFS. My HDFS block size is 128M. Each
file is significantly smaller than the HDFS block size. Then, I want to know
whether each small file uses 128M in HDFS?
regards
2011-09-21
hao.wang
overhead by having to
track a larger number of small files. So, if you can merge files, it's
best practice to do so.
-Joey
On Tue, Sep 20, 2011 at 9:54 PM, hao.wang hao.w...@ipinyou.com wrote:
Hi All:
I have lots of small files stored in HDFS. My HDFS block size is 128M. Each
file
Hi, Joey:
Thanks for your help!
2011-09-21
hao.wang
From: Joey Echeverria
Sent: 2011-09-21 10:10:54
To: common-user
Cc:
Subject: Re: block size
HDFS blocks are stored as files in the underlying filesystem of your
datanodes. Those files do not take a fixed amount of space, so if you
Todd-
Ouch. I'm stuck with 0.21 for the near future, so I'll just write a small
app that copies a file using a different block size.
For reference, the config dir override using the following command did not
work either:
HADOOP_CONF_DIR=mycustomconf bin/hadoop dfs -put /src/path /dest/path
- Original Message -
From: Ben Clay rbc...@ncsu.edu
Date: Saturday, August 27, 2011 10:03 pm
Subject: set reduced block size for a specific file
To: hdfs-user@hadoop.apache.org
I'd like to set a lowered block size for a specific file. IE, if
HDFS is
configured to use 64mb blocks, I'd like
I didn't even think of overriding the config dir. Thanks for the tip!
-Ben
-Original Message-
From: Allen Wittenauer [mailto:a...@apache.org]
Sent: Saturday, August 27, 2011 6:42 PM
To: hdfs-user@hadoop.apache.org
Cc: rbc...@ncsu.edu
Subject: Re: set reduced block size for a specific
for lack of
features/bugs.
1. Copy $HADOOP_CONF_DIR or $HADOOP_HOME/conf to a dir
2. modify the hdfs-site.xml to have your new block size
3. Run the following:
HADOOP_CONF_DIR=mycustomconf hadoop dfs -put file dir
Convenient? No. Doable? Definitely.
Hi,
I have established a Hadoop cluster with one NameNode and two DataNodes.
Now I have a question about block size. I set the block size to 64MB. I store
one text file (50MB) on HDFS. Will this text file be split? If not, which
DataNode is the text file stored on? I use MapReduce
If 64 mb is your hdfs block size then the 50 mb file won't be split; it would
be stored in a single block in hdfs. AFAIK which data node, or rather which
data nodes, is decided by the name node. The block would be replicated and
stored; by default the replication factor is 3. So in your case
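If you want to confirm what actually happened, running something like
hdfs fsck /path/to/file -files -blocks -locations against the file should list
its single block and the DataNodes holding the replicas (hadoop fsck is the
older spelling of the same command).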
Hi all,
I wanna ask a question:
Is there a reason why the block size should be set to some 2^N, for some
integer N? Does it help with block defragmentation etc.?
Thanks in advance.
FYI, I've added this to the FAQ since it comes up every so often.