Hi
I had a failure of one of the machines my JournalNode is running on.
I've restored that machine's setup and would like to attach it to the
existing JournalNode Quorum.
When I try to run it I get the following error:
ERROR org.apache.hadoop.security.UserGroupInformation:
Hi,
Is it possible to keep 1 Petabyte in a single data node?
If not, what is the maximum storage for a particular data node?
Regards,
M. Jeba
I would say the hard limit comes from the OS's local file system (and your
budget).
So the short answer for ext3: it doesn't seem so.
http://en.wikipedia.org/wiki/Ext3
And I am not sure that answer is the most interesting one. Even if you could
put 1 petabyte on one node, what is usually interesting is the
Hi Jeba,
There are other considerations too. For example, if a single node holds 1 PB
of data and it were to die, this would cause a significant amount of
traffic as the NameNode arranges for new replicas to be created.
Vijay
From: Bertrand Dechoux [mailto:decho...@gmail.com]
Sent: 30
What would be the reason you would do that?
You would want to leverage a distributed dataset for higher availability and
better response times.
The maximum storage depends completely on the disk capacity of your nodes and
what your OS supports. Typically I have heard of about 1-2 TB/node to
I want to use either Ubuntu or Red Hat.
I just want to know how much storage space we can allocate in a single data
node.
Are there any limitations in Hadoop for storage on a single node?
Regards,
Jeba
From: Pamecha, Abhishek apame...@ebay.com
To:
You should probably think about this in a more cluster-oriented fashion. A
single node with a PB of data is probably not a good CPU-to-disk
allocation ratio. In addition, you need enough RAM on your NameNode to keep
track of all of your blocks. A few nodes with a PB each would quickly drive
up NN RAM
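As a rough illustration (not from the thread), here is a back-of-envelope
estimate of the NameNode heap that 1 PB on a node implies, assuming the
commonly cited figure of roughly 150 bytes of NN heap per block and a
128 MB block size; both numbers are assumptions and vary by version and
configuration:

    // Hypothetical back-of-envelope NameNode heap estimate.
    public class NnHeapEstimate {
        public static void main(String[] args) {
            long nodeCapacityBytes = 1L << 50;  // 1 PB on a single data node
            long blockSizeBytes = 128L << 20;   // 128 MB HDFS block size (assumed)
            long heapBytesPerBlock = 150;       // rough rule of thumb for NN heap

            long blocks = nodeCapacityBytes / blockSizeBytes;   // ~8.4M blocks
            long heapBytes = blocks * heapBytesPerBlock;        // ~1.2 GB of heap

            System.out.printf("blocks on a 1 PB node: %d%n", blocks);
            System.out.printf("approx NN heap for them: %d MB%n",
                    heapBytes / (1 << 20));
        }
    }

So even one such node accounts for millions of block objects in NN heap,
which is exactly the pressure described above.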
I completely agree with everyone in the thread. Perhaps you are not
concerned much about the processing part, but it is still not a good idea.
Remember, the power of Hadoop lies in the principle of divide and conquer,
and you are trying to go against that.
On Wednesday, January 30, 2013, Chris Embree
Can you say CentOS?
:-)
Sent from a remote device. Please excuse any typos...
Mike Segel
On Jan 30, 2013, at 4:21 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org
wrote:
Hi,
Also, think about the memory you will need in your DataNode to serve
all this data... I'm not sure there is any
I have seen the map input bytes counter go negative temporarily on
Hadoop 1.x at the beginning of a job. It then corrects itself later in
the job and seems to be accurate. Any ideas?
http://terrier.org/docs/v2.2.1/hadoop_indexing.html
I also saw this behavior in a job output listed at the
Thanks for the tip.
The sqoop command listed in the stdout log file is:
sqoop import --driver org.apache.derby.jdbc.ClientDriver --connect jdbc:derby://test-server:1527/mondb
Hi Nan,
When the NameNode EXITS safemode, you can assume that all blocks ARE
fully replicated. If the NameNode is still IN safemode, that means that
NOT all blocks are fully replicated.
JM
2013/1/29, Nan Zhu zhunans...@gmail.com:
So, we can assume that all blocks are fully
Avro's versioning capability might help if that could replace
SequenceFile in your workflow.
Just a thought.
-Terry
On 1/29/13 9:17 PM, David Parks wrote:
I'll consider a patch to the SequenceFile, if we could manually override the
sequence file input Key and Value that's read from the
That is correct if you do not manually exit NN safemode.
Regards
Chen
On Jan 30, 2013 8:59 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org
wrote:
Hi Nan,
When the NameNode EXITS safemode, you can assume that all blocks ARE
fully replicated. If the NameNode is still IN safemode
bin/hadoop dfsadmin -report should give you what you are looking for.
A node is blacklisted only if there are too many failures on a particular
node. You can clear it by restarting the particular datanode or tasktracker
service. This is for the better performance of your Hadoop cluster to
The NN does recalculate new replication work for unavailable
replicas (under-replication) when it starts and receives all block
reports, but it executes this work only after it is out of safemode. While
in safemode, no mutations are allowed across the HDFS services.
On Wed, Jan 30, 2013 at 8:34 AM, Nan
Hi Harsh
I have a question: how does the NameNode get out of safemode when data
blocks have been lost? Only via the administrator? According to my
experience, the NN (0.21) stayed in safemode for several days before I
manually turned safemode off.
There were 2 blocks lost.
Chen
On Wed, Jan 30, 2013 at 10:27
The following are the configs it looks for. Unless the admin forces it to
come out of safemode, it respects the values below:
dfs.namenode.safemode.threshold-pct (default: 0.999f): Specifies the
percentage of blocks that should satisfy the minimal replication
requirement defined by dfs.namenode.replication.min. Values
Yes, if there are missing blocks (i.e. all replicas lost), and the
block availability threshold is set to its default of 0.999f (99.9%
availability required), then NN will not come out of safemode
automatically. You can control this behavior by configuring
dfs.namenode.safemode.threshold-pct.
On Wed,
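For reference, a minimal sketch of what that threshold looks like in
hdfs-site.xml, assuming the dfs.namenode.safemode.threshold-pct property
name quoted above (the exact name varies across Hadoop versions):

    <property>
      <name>dfs.namenode.safemode.threshold-pct</name>
      <!-- Fraction of blocks that must meet dfs.namenode.replication.min
           before the NN leaves safemode on its own. -->
      <value>0.999f</value>
    </property>

And the manual override discussed in this thread is the dfsadmin command:

    bin/hadoop dfsadmin -safemode leave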
Hi Team,
What is the best way to migrate data residing on one cluster to another
cluster?
Are there better methods available than distcp?
What if the two clusters are running different RPC protocol versions?
Cheers !!!
Siddharth Tiwari
Have a refreshing day !!!
I think Chen is asking about lost replicas.
So, according to Harsh's reply, in safemode the NN will know of all blocks
which have fewer replicas than 3 (the default setup) but no fewer than 1, and
after getting out of safemode it will instruct the actual replication work?
Hope I understand it
DistCp is the fastest option, letting you copy data in parallel. For
incompatible RPC versions between different HDFS clusters, the HFTP
solution can work (documented in the DistCp manual).
On Wed, Jan 30, 2013 at 10:13 PM, Siddharth Tiwari
siddharth.tiw...@live.com wrote:
Hi Team,
What is the
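A minimal sketch of that DistCp-over-HFTP pattern, run from the destination
cluster; the hostnames, paths, and ports here are placeholders (50070 is
the usual NameNode web port default):

    hadoop distcp hftp://source-nn:50070/user/data hdfs://dest-nn:8020/user/data

Reading the source over HFTP (HTTP) is what sidesteps the RPC version
mismatch between the two clusters.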
Well, the documentation is more explicit:
"Specifies the percentage of blocks that should satisfy the minimal
replication requirement defined by dfs.namenode.replication.min."
Which happens to be 1 by default but doesn't need to stay that way.
Regards
Bertrand
On Wed, Jan 30, 2013 at 5:45
Yaotian,
* Oozie version?
* More details on what exactly your workflow action is (mapred, java, shell,
etc)
* What is in the task log of the Oozie launcher job for that action?
Thx
On Fri, Jan 25, 2013 at 10:43 PM, yaotian yaot...@gmail.com wrote:
I manually run it in Hadoop. It works.
But
Hemanth,
Is FS caching enabled or not in your cluster?
A simple solution would be to modify your mapper code not to close the FS.
It will go away when the task ends anyway.
Thx
On Thu, Jan 24, 2013 at 5:26 PM, Hemanth Yamijala yhema...@thoughtworks.com
wrote:
Hi,
We are noticing a
Corbett,
[Moving thread to user@oozie.a.o, BCCing common-user@hadoop.a.o]
* What version of Oozie are you using?
* Is the cluster a secure setup (Kerberos enabled)?
* Would you mind posting the complete launcher logs?
Thx
On Wed, Jan 30, 2013 at 6:14 AM, Corbett Martin comar...@nhin.com
Hi All,
I wanted to know how to connect Hadoop with MicroStrategy.
Any help is very helpful.
Waiting for your response.
Note: Any URL and example will be really helpful for me.
Thanks,
samir
Hi all
We need connectivity between SAP HANA and Hadoop.
If you have any experience with that, can you please share some documents
and examples with me? That would be really helpful for me.
thanks,
samir
We are using Cloudera Hadoop.
On Thu, Jan 31, 2013 at 2:12 AM, samir das mohapatra
samir.help...@gmail.com wrote:
Hi All,
I wanted to know how to connect Hadoop with MicroStrategy.
Any help is very helpful.
Waiting for your response.
Note: Any URL and example will be really help
The token renewer needs to be the job tracker principal. I think Oozie had
the MR token hardcoded at one point, but later changed it to use a conf
setting.
The rest of the log looks very odd, i.e. it looks like security is off, but
it can't be. It's trying to renew hdfs tokens issued for the hdfs
Hi All,
I am using HBase 0.92.1. I am trying to break the HBase bulk load into
multiple MR jobs, since I want to populate more than one HBase table from a
single CSV file. I have looked into the MultiTableOutputFormat class, but it
doesn't solve my purpose because it does not generate HFiles.
I
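One way to read this (a hypothetical sketch, not from the thread): run one
MR job per target table, each reading the same CSV and writing HFiles
through HFileOutputFormat.configureIncrementalLoad, which is the standard
bulk-load setup in HBase 0.92. The column family, CSV layout, and mapper
below are placeholder assumptions:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class PerTableBulkLoad {

        // Hypothetical mapper: turns one CSV line into a Put for the
        // current table (f[0] = row key, f[1] = value, by assumption).
        public static class CsvToPutMapper
                extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
            @Override
            protected void map(LongWritable offset, Text line, Context ctx)
                    throws java.io.IOException, InterruptedException {
                String[] f = line.toString().split(",");
                Put put = new Put(Bytes.toBytes(f[0]));
                put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"),
                        Bytes.toBytes(f[1]));
                ctx.write(new ImmutableBytesWritable(Bytes.toBytes(f[0])), put);
            }
        }

        // One job per target table; run this N times with different tables.
        public static Job createJob(Configuration conf, String tableName,
                                    Path csvIn, Path hfilesOut) throws Exception {
            Job job = new Job(conf, "bulkload-" + tableName);
            job.setJarByClass(PerTableBulkLoad.class);
            job.setMapperClass(CsvToPutMapper.class);
            job.setMapOutputKeyClass(ImmutableBytesWritable.class);
            job.setMapOutputValueClass(Put.class);
            FileInputFormat.addInputPath(job, csvIn);
            FileOutputFormat.setOutputPath(job, hfilesOut);
            // Wires in HFileOutputFormat, a total-order partitioner matched
            // to the table's region boundaries, and the Put-sorting reducer.
            HFileOutputFormat.configureIncrementalLoad(job,
                    new HTable(conf, tableName));
            return job;
        }
    }

After each job finishes, the generated HFiles would be moved into the
corresponding table with the completebulkload tool (LoadIncrementalHFiles).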
Hi,
Partial answer: you can get the blacklisted tasktrackers using the command
line:
mapred job -list-blacklisted-trackers.
Also, I think that a blacklisted tasktracker becomes 'unblacklisted' if it
works fine after some time. Though I am not very sure about this.
Thanks
hemanth
On Wed, Jan 30,
Hi
I posted my question a day ago; can somebody please help me figure out
what the problem is?
Thank you.
regards
YouPeng Yang
-- Forwarded message --
From: YouPeng Yang yypvsxf19870...@gmail.com
Date: 2013/1/30
Subject: YARN NM containers were killed
To:
FS caching is enabled on the cluster (i.e. the default is not changed).
Our code isn't actually mapper code, but a standalone Java program being
run as part of Oozie. It just seemed confusing and not a very clear
strategy to leave unclosed resources. Hence my suggestion to get an
uncached FS
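For what it's worth, a minimal sketch of obtaining an uncached FileSystem
that is safe to close, via FileSystem.newInstance (which bypasses the FS
cache); the URI here is a placeholder, and whether this fits the Oozie
setup in question is an assumption:

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class UncachedFsExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // newInstance() bypasses the FileSystem cache, so closing this
            // object cannot break other users of the cached instance.
            FileSystem fs = FileSystem.newInstance(
                    URI.create("hdfs://namenode:8020"), conf); // placeholder URI
            try {
                System.out.println(fs.exists(new Path("/tmp")));
            } finally {
                fs.close(); // closes only this uncached instance
            }
        }
    }

Setting fs.hdfs.impl.disable.cache to true in the configuration is another
way to get uncached instances, at the cost of disabling the cache for every
lookup through that conf.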
I might be wrong but have you considered distcp?
On Jan 31, 2013 11:15 AM, samir das mohapatra samir.help...@gmail.com
wrote:
Hi All,
Does anyone know how to load data from one Hadoop cluster (CDH4) to
another cluster (CDH4)? The way our project needs it:
1) It should be a delta load or
Hi All,
My company wants to pick the right distribution of Apache Hadoop
for its production as well as its dev environment. Can anyone suggest
which one will be good for the future?
Hints:
They want to know both the pros and cons.
Regards,
samir.
thanks all.
On Thu, Jan 31, 2013 at 11:19 AM, Satbeer Lamba satbeer.la...@gmail.comwrote:
I might be wrong but have you considered distcp?
On Jan 31, 2013 11:15 AM, samir das mohapatra samir.help...@gmail.com
wrote:
Hi All,
Does anyone know how to load data from one Hadoop
Hello Vikas,
It clearly shows that the class cannot be found. For
debugging, you can write your MR job as a standalone Java program and debug
it; that works. And if you want to debug just your mapper/reducer logic,
you should look into using MRUnit. There is a good
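As an illustration of the MRUnit approach (the mapper under test and its
expected output here are hypothetical placeholders, not from the thread):

    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mrunit.mapreduce.MapDriver;
    import org.junit.Test;

    public class WordCountMapperTest {

        // Hypothetical mapper under test: emits (token, 1) per word.
        public static class WordCountMapper
                extends Mapper<LongWritable, Text, Text, IntWritable> {
            @Override
            protected void map(LongWritable key, Text value, Context ctx)
                    throws IOException, InterruptedException {
                StringTokenizer tok = new StringTokenizer(value.toString());
                while (tok.hasMoreTokens()) {
                    ctx.write(new Text(tok.nextToken()), new IntWritable(1));
                }
            }
        }

        @Test
        public void emitsOneCountPerToken() throws IOException {
            // MRUnit drives a single map() call in-process; no cluster needed.
            MapDriver<LongWritable, Text, Text, IntWritable> driver =
                    MapDriver.newMapDriver(new WordCountMapper());
            driver.withInput(new LongWritable(0), new Text("hello hello"))
                  .withOutput(new Text("hello"), new IntWritable(1))
                  .withOutput(new Text("hello"), new IntWritable(1))
                  .runTest();
        }
    }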