System with: 1 billion small files.
The namenode will need to maintain the metadata structures for all those files.
The system will have at least 1 block per file, and if you have the replication
factor set to 3, the system will have 3 billion block replicas.
Now, if you try to read all these files in a job, you will be
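To put rough numbers on the memory pressure described above: a commonly cited rule of thumb is on the order of ~150 bytes of namenode heap per namespace object (file or block); that figure is an assumption here and varies by Hadoop version. A minimal sketch:

```java
public class NamenodeHeapEstimate {
    // Rough namenode heap estimate. bytesPerObject is an assumption
    // (~150 bytes per file/block object is a commonly cited rule of thumb,
    // not an exact number).
    static long estimateBytes(long files, long blocksPerFile, long bytesPerObject) {
        long objects = files + files * blocksPerFile; // one file object plus its block objects
        return objects * bytesPerObject;
    }

    public static void main(String[] args) {
        long heap = estimateBytes(1_000_000_000L, 1, 150);
        // 1B files + 1B blocks, 150 bytes each: on the order of hundreds of GiB of heap
        System.out.println(heap / (1L << 30) + " GiB");
    }
}
```

Even before replication (which multiplies datanode storage, not namenode objects), the namespace alone is far beyond a practical JVM heap, which is why packing small files into larger containers is the usual advice.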
Once the nodes are listed as dead, if you still have the host names in your
conf/exclude file, remove the entries and then run hadoop dfsadmin
-refreshNodes.
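As a concrete sketch of that sequence (the conf/exclude path and hostname are examples; use whatever your dfs.hosts.exclude property actually points at):

```shell
# Add the host to the exclude file and tell the namenode to re-read it.
echo "datanode1.example.com" >> conf/exclude
hadoop dfsadmin -refreshNodes

# Once the node is listed as dead, remove the entry and refresh again.
sed -i '/datanode1.example.com/d' conf/exclude
hadoop dfsadmin -refreshNodes
```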
This works for us on our cluster.
-paul
On Tue, Jan 27, 2009 at 5:08 PM, Bill Au wrote:
I was able to decommission a datanode successfully without having to stop my
cluster. But I noticed that after a node has been decommissioned, it shows
up as a dead node in the web-based interface to the namenode (i.e.
dfshealth.jsp). My cluster is relatively small and losing a datanode will
have pe
On Mon, Jan 26, 2009 at 5:40 PM, Vadim Zaliva wrote:
> Is it possible to obtain auto-generated IDs when writing data using
> DBOutputFormat?
>
> For example, is it possible to write a Mapper which stores records in a DB
> and returns the auto-generated IDs of these records?
...
> which I would like t
You may also want to have a look at this to reach a decision based on your
needs:
http://www.swaroopch.com/notes/Distributed_Storage_Systems
Jim
On Tue, Jan 27, 2009 at 1:22 PM, Jim Twensky wrote:
Rasit,
What kind of data will you be storing on HBase or directly on HDFS? Do you
aim to use it as a data source to do some key/value lookups for small
strings/numbers, or do you want to store larger files labeled with some sort
of key and retrieve them during a MapReduce run?
Jim
Yes, I did run fsck after upgrade. No error message. Everything is "OK".
yy
Brian Bockelman
Yes, I did that. But there was an error message asking me to roll back
first. So I ended up doing a -rollback first and then an -upgrade.
yy
Bill Au
Perhaps what you are looking for is HBase?
http://hbase.org
HBase is a column-oriented, distributed store that sits on top of HDFS and
provides random access.
JG
> -----Original Message-----
> From: Rasit OZDAS [mailto:rasitoz...@gmail.com]
> Sent: Tuesday, January 27, 2009 1:20 AM
> To: core-
Tossing one more on this king of all threads:
Stuart Sierra of AltLaw wrote a nice little tool to serialize tar.bz2 files
into SequenceFile, with filename as key and its contents a BLOCK-compressed
blob.
http://stuartsierra.com/2008/04/24/a-million-little-files
flip
Hey YY,
At a more basic level -- have you run fsck on that file? What were
the results?
Brian
On Jan 27, 2009, at 10:54 AM, Bill Au wrote:
Did you start your namenode with the -upgrade after upgrading from 0.18.1 to
0.19.0?
Bill
On Mon, Jan 26, 2009 at 8:18 PM, Yuanyuan Tian wrote:
>
>
> Hi,
>
> I just upgraded hadoop from 0.18.1 to 0.19.0 following the instructions on
> http://wiki.apache.org/hadoop/Hadoop_Upgrade. After upgrade,
This is a little note to advise universities working on Hadoop-related
projects that they may
be able to get some money and cluster time for some fun things
http://www.hpl.hp.com/open_innovation/irp/
"The HP Labs Innovation Research Program is designed to create
opportunities -- at college
All,
I'm broadcasting this to all of the Hadoop dev and users lists,
however, in the future I'll only send cross-subproject announcements
to gene...@hadoop.apache.org. Please subscribe over there too! It is
very low traffic.
Anyways, ApacheCon Europe is coming up in March. There are a r
Is there a way to programmatically get the number of records in a MapFile
without doing a complete scan?
Hi Tien,
Configuration config = new Configuration(true);
config.addResource(new Path("/etc/hadoop-0.19.0/conf/hadoop-site.xml"));
FileSystem fileSys = FileSystem.get(config);
FileStatus status = fileSys.getFileStatus(new Path("/some/file")); // example path
BlockLocation[] locations =
    fileSys.getFileBlockLocations(status, 0, status.getLen());
I copied some lines of my code, it can also help if you p
Edward Capriolo wrote:
Zeroconf is more focused on simplicity than security. One of the
original problems, which may have been fixed, is that any program can
announce any service, e.g. my laptop can announce that it is the DNS for
google.com, etc.
-1 to zeroconf as it is way too chatty. Every DNS lo
Edwin wrote:
Hi
I am looking for a way to interrupt a thread that entered
JobClient.runJob(). The runJob() method keeps polling the JobTracker until
the job is completed. After reading the source code, I know that the
InterruptedException is caught in runJob(). Thus, I can't interrupt it using
Thread.interrupt()
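One common workaround is to own the polling loop yourself so the interrupt reaches your code: with Hadoop 0.19 you can get a RunningJob from JobClient.submitJob() and poll its isComplete(). The sketch below shows the pattern with a hypothetical JobHandle interface standing in for the Hadoop handle (it is not Hadoop's API):

```java
// Sketch of an interruptible polling loop. JobHandle is an illustrative
// stand-in; with Hadoop you would poll RunningJob.isComplete() instead.
public class InterruptiblePoll {

    interface JobHandle {
        boolean isComplete();
    }

    /** Returns true when the job finishes, false if the thread was interrupted. */
    static boolean waitForCompletion(JobHandle job, long pollMillis) {
        while (!job.isComplete()) {
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the interrupt flag
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        JobHandle alreadyDone = () -> true;
        System.out.println(waitForCompletion(alreadyDone, 10)); // prints true
    }
}
```

Unlike runJob(), this version lets Thread.interrupt() break the wait, because the InterruptedException is handled by your own loop rather than swallowed inside the library call.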
Hi,
I wanted to ask if HDFS is a good solution just as a distributed DB (no
running jobs, only get and put commands).
A review says that "HDFS is not designed for low latency" and besides, it's
implemented in Java.
Do these disadvantages prevent us using it?
Or could somebody suggest a better (fast