On Fri, Nov 11, 2011 at 10:15 AM, Matt Foley wrote:
> Nope; hot swap :-)
AFAIK you can't re-add the marked-dead disk to the DN, can you?
But yea, you can hot-swap the disk, then kick the DN process, which
should take less than 10 minutes. That means the NN won't ever notice
it's down, and you wo
Nope; hot swap :-)
On Nov 11, 2011, at 9:59 AM, Steve Ed wrote:
I understand that with 0.20.204, loss of a disk doesn't lose the node.
But if we have to replace that lost disk, it again means scheduling the whole
node down, kicking off replication
*From:* Matt Foley [mailto:mfo...@hortonworks.com]
I understand that with 0.20.204, loss of a disk doesn't lose the node. But
if we have to replace that lost disk, it again means scheduling the whole node
down, kicking off replication
From: Matt Foley [mailto:mfo...@hortonworks.com]
Sent: Friday, November 11, 2011 1:58 AM
To: hdfs-user@hadoop.apache.org
Matt,
Thanks for pointing that out. I was talking about machine chassis failure
since it is the more serious case, but should have pointed out that losing
single disks is subject to the same logic with smaller amounts of data.
If, however, an installation uses RAID-0 for higher read speed then a
Another factor to consider: when a disk is bad you may have corrupted blocks
which may only get detected by the periodic DataBlockScanner check.
I believe each datanode tries to finish the entire scan within
dfs.datanode.scan.period.hours (3 weeks by default).
So with 2x replication and some undetec
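For reference, a minimal hdfs-site.xml sketch of the property being discussed; the 168-hour value is only an assumed example override for illustration, not a recommendation:

<!-- Illustrative only: shorten the DataNode block scanner period so that
     corrupt replicas surface sooner than the ~3-week default mentioned above. -->
<property>
  <name>dfs.datanode.scan.period.hours</name>
  <value>168</value>
</property>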
Thanks Harsh!...
2011/11/11 Harsh J
> Sorry Bejoy, I'd typed that URL out from memory.
> Fixed link is: http://wiki.apache.org/hadoop/HadoopMapReduce
>
> 2011/11/11 Bejoy KS :
> Thanks Harsh for correcting me with that wonderful piece of information.
> > Cleared a wrong
As Todd said, HDFS isn't suited to this. You could take a look at Gluster
though. It seems like it would fit your needs better.
-Ivan
Sorry Bejoy, I'd typed that URL out from memory.
Fixed link is: http://wiki.apache.org/hadoop/HadoopMapReduce
2011/11/11 Bejoy KS :
> Thanks Harsh for correcting me with that wonderful piece of information.
> Cleared a wrong assumption on hdfs storage fundamentals today.
>
>
Thanks Bejoy, that helps a lot!
2011/11/11 Bejoy KS:
> Hi Donal
> I don't have much exposure to the domain you are pointing to, but from a
> plain MapReduce developer's perspective this would be my way of looking at
> processing such a data format with MapReduce:
> - If the data i
Hi Donal
I don't have much exposure to the domain you are pointing to, but from a
plain MapReduce developer's perspective this would be my way of looking at
processing such a data format with MapReduce:
- If the data is kind of flowing in continuously then I'd use Flume to
collect t
Thanks Harsh for correcting me with that wonderful piece of information.
Cleared a wrong assumption on hdfs storage fundamentals today.
Sorry Donal for confusing you over the same.
Harsh,
Looks like the link is broken; it'd be great if you could post the
URL once more.
Thanks a lot
Regards
Hi,
Please also feel free to contact me. I'm working with the STAR project at
Brookhaven Lab, and we are trying to build an MR workflow for analysis of
particle data. I've done some preliminary experiments running Root and other
nuclear physics analysis software in MR and have been looking at various
hi Steve,
What's your HDFS release version? From the error log and the HDFS 0.21
code, I guess that the file does not have any replicas. You may focus on
the missing replica of this file.
Pay attention to the NameNode log with that block id and track the
replica distribution. Or check the Na
Hi Donal-
On Fri, Nov 11, 2011 at 10:12:44PM +0800, Donal wrote:
> My scenario is that I have lots of files from a High Energy Physics experiment.
> These files are in binary format, about 2G each, but basically they are
> composed of lots of "Event"s, and each Event is independent of the others. The
> phy
Hi Bejoy,
I don't understand why it's impossible to have half of a line in one block,
since the file is split into fixed-size blocks.
My scenario is that I have lots of files from a High Energy Physics
experiment.
These files are in binary format, about 2G each, but basically they are
composed of
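One possible way to handle opaque binary event files like these, assuming an Event can only be delimited by the format's own framing, is a custom InputFormat that marks the files non-splittable, so a single mapper reads each whole file and no Event is ever torn across an input split. The class below is a hypothetical sketch (EventFileInputFormat is not an existing Hadoop class, and you would still have to supply a RecordReader that emits one Event per record):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

// Hypothetical sketch: keep each binary event file whole so no Event can be
// split across map tasks; parallelism then comes from having many files
// rather than from splitting one file.
public class EventFileInputFormat extends FileInputFormat<LongWritable, BytesWritable> {

    @Override
    protected boolean isSplitable(JobContext context, Path file) {
        return false;  // never split: one map task reads the whole file
    }

    @Override
    public RecordReader<LongWritable, BytesWritable> createRecordReader(
            InputSplit split, TaskAttemptContext context) {
        // Placeholder: a real implementation would return a RecordReader that
        // walks the file using the experiment format's own event framing.
        throw new UnsupportedOperationException("plug in an Event record reader");
    }
}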
Bejoy,
This is incorrect. As Denny had explained earlier, blocks are split along byte
sizes alone. The writer does not concern itself with newlines and such. When
reading, the record readers align themselves to read till the end of lines by
communicating with the next block if they have to.
Th
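To make that concrete, here is a small self-contained Java toy; it is not the actual Hadoop LineRecordReader, just a sketch of the same rule: a reader skips the partial line at the start of its split and reads past the end of its split to finish the last line, so every reader sees only whole lines even when a line straddles the block boundary.

import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class SplitLineDemo {
    // Read "whole lines only" from the byte range [start, end] of data,
    // following the same rule a line-oriented record reader uses.
    static List<String> readSplit(byte[] data, int start, int end) {
        List<String> lines = new ArrayList<>();
        int pos = start;
        if (start != 0) {
            // Not the first split: the previous reader owns the line that
            // crosses into this split, so skip ahead past the next newline.
            while (pos < data.length && data[pos] != '\n') pos++;
            pos++;
        }
        while (pos <= end && pos < data.length) {
            int eol = pos;
            // May read past 'end' into the next block to finish the line.
            while (eol < data.length && data[eol] != '\n') eol++;
            lines.add(new String(data, pos, eol - pos, StandardCharsets.UTF_8));
            pos = eol + 1;
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] file = "first line\nsecond line crosses the boundary\nthird\n"
                .getBytes(StandardCharsets.UTF_8);
        int boundary = 20;  // pretend the first block ends here
        System.out.println(readSplit(file, 0, boundary - 1));
        System.out.println(readSplit(file, boundary, file.length - 1));
    }
}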
Donal
In Hadoop that hardly ever happens. When you are storing data in HDFS it
would be split into blocks depending on end of lines, in the case of normal
files. It won't be like you'd have half of a line in one block and the
rest in the next one. You don't need to worry about that.
Thanks Bejoy!
It's better to process the data blocks locally and separately.
I just want to know how to deal with a structure (i.e. a word or a line) that
is split into two blocks.
Cheers,
Donal
On Nov 11, 2011 at 7:01 PM, Bejoy KS wrote:
> Hi Donal
> You can configure your map tasks the way you like t
Hi Donal
You can configure your map tasks the way you like to process your
input. If you have a file of size 100 MB, it would be divided into two input
blocks and stored in HDFS (if your dfs.block.size is the default 64 MB). It is
your choice how you process the same using MapReduce
- With th
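As a rough sketch of the configurable part, something along the following lines (new-API Hadoop; the class name, job name and the 128 MB figure are assumptions for illustration) would push both blocks of the 100 MB file into a single split, and hence a single map task, instead of the default two:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SplitSizeExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "split-size-demo");  // 0.20-era constructor
        job.setJarByClass(SplitSizeExample.class);
        job.setInputFormatClass(TextInputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // Ask for splits of at least 128 MB so both 64 MB blocks of the
        // 100 MB file land in one split, i.e. one map task.
        FileInputFormat.setMinInputSplitSize(job, 128L * 1024 * 1024);
        // Mapper/Reducer left at the framework defaults (identity) for brevity.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}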
Thanks Denny!
So that means each map task will have to read from another DataNode in order
to read the end line of the previous block?
Cheers,
Donal
2011/11/11 Denny Ye
> hi
> Structured data can always end up split across different blocks, like a
> word or a line.
> MapReduce tasks read HDFS da
I agree with Ted's argument that 3x replication is way better than 2x. But
I do have to point out that, since 0.20.204, the loss of a disk no longer
causes the loss of a whole node (thankfully!) unless it's the system disk.
So in the example given, if you estimate a disk failure every 2 hours,
ea
hi
Structured data can always end up split across different blocks, like a
word or a line.
MapReduce tasks read HDFS data in units of a *line*: the reader will read the
whole line, from the end of the previous block into the start of the subsequent
one, to obtain that part of the line record. So you don't need to worry about the I