The problem you describe occurs with NFS also.
Basically, single-site semantics are very hard to achieve on a networked
file system.
On Mon, Feb 14, 2011 at 8:21 PM, Gokulakannan M wrote:
> I agree that HDFS doesn't strongly follow POSIX semantics. But it would
> have been better if this issue were fixed.
I agree that HDFS doesn't strongly follow POSIX semantics. But it would have
been better if this issue were fixed.
From: Ted Dunning [mailto:tdunn...@maprtech.com]
Sent: Monday, February 14, 2011 10:18 PM
To: gok...@huawei.com
Cc: common-user@hadoop.apache.org; hdfs-u...@hadoop.apache
I completely agree, and I am using your and the group's postings to define
the direction and approaches, but I am also trying every solution myself -
and I am beginning to do just that, with the AvatarNode, now.
Thank you,
Mark
On Mon, Feb 14, 2011 at 4:43 PM, M. C. Srivas wrote:
> I understand you are writ
I understand you are writing a book, "Hadoop in Practice". If so, it's
important that what's recommended in the book be verified in
practice. (I mean, beyond simply posting in this newsgroup - for instance,
the recommendations on NN fail-over should be tried out first before writing
about how
Note that the document purports to be from 2008 and, at best, was uploaded
just about a year ago.
That it is still pretty accurate is a tribute to either the stability of
HBase or its stagnation, depending on how you read it.
On Mon, Feb 14, 2011 at 12:31 PM, Mark Kerzner wrote:
> As a conclu
Thank you, M. C. Srivas, that was enormously useful. I understand it now,
but just to be complete, I have re-formulated my points according to your
comments:
- In 0.20 the Secondary NameNode performs snapshotting. Its data can be
used to recreate the HDFS if the Primary NameNode fails. The p
Hi Sandhya,
the threshold for leaving safemode automatically is configurable; it defaults
to 0.999, but you can set the parameter "dfs.namenode.safemode.threshold-pct" to
a different floating-point number in your config. It is set to almost 100% by
default, on the theory that (a) if you didn't h
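For example, the override would go in hdfs-site.xml like this (the value 0.99
is purely illustrative; pick what suits your cluster):

```xml
<!-- hdfs-site.xml: leave safemode once 99% of blocks have reported in -->
<property>
  <name>dfs.namenode.safemode.threshold-pct</name>
  <value>0.99</value>
</property>
```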
Hi all,
I want to know how I can check/compare my mappers' keys and values.
Example:
My Mappers have the following to filter documents before being output:
String doc1;
if(!doc1.equals("d1"))
output.collect(new Text('#'+doc1+'#'), new
Text('#'+word1.substring(word1.indexOf(',')+1, word1
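The filtering logic in the snippet above can be sketched in plain Java, outside
the MapReduce framework, to make the key/value construction easy to test. The
`filter` helper below is hypothetical (not part of the original code); it
assumes `doc1` is a document id and `word1` contains a comma-separated prefix
to strip, mirroring the `substring(indexOf(',') + 1)` call in the question:

```java
import java.util.AbstractMap;
import java.util.Map;

public class FilterDemo {
    // Hypothetical stand-in for the mapper body: skip records whose doc id
    // is "d1", otherwise emit the key/value pair the mapper would collect.
    static Map.Entry<String, String> filter(String doc1, String word1) {
        if (doc1.equals("d1")) {
            return null; // record filtered out, nothing collected
        }
        String key = "#" + doc1 + "#";
        // keep everything after the first comma, as in the original snippet
        String value = "#" + word1.substring(word1.indexOf(',') + 1) + "#";
        return new AbstractMap.SimpleEntry<>(key, value);
    }

    public static void main(String[] args) {
        System.out.println(filter("d2", "3,hello")); // emitted
        System.out.println(filter("d1", "3,hello")); // null: filtered out
    }
}
```

Factoring the logic out this way lets you unit-test it without running a job;
inside the real mapper you would wrap the result in `Text` objects and pass
them to `output.collect(...)`.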
Hi Shivani,
You probably don't want to ask m45 specific questions on hadoop.apache mailing
list.
Try
% hadoop queue -showacls
It should show which queues you're allowed to submit to. If it doesn't give
you any queues, you need to request one.
Koji
On 2/9/11 9:10 PM, "Shivani Rao" wrote:
Hello,
2011/2/14 Wang LiChuan[王立传] :
> So my question is this:
>
> If I return "false" from "isSplitable" to tell the framework that it's a
> non-splittable file, how many map tasks will I have when doing map/reduce?
> Will I have 100 maps, each one handling a file? Or do you have any
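The arithmetic behind the question can be sketched with a simplified model of
how FileInputFormat computes splits (this is an assumption for illustration:
it ignores the split-slop factor and the min/max split-size settings):

```java
public class SplitCount {
    // Simplified model of split computation: a non-splittable file always
    // becomes exactly one split (one map task); a splittable file yields
    // roughly ceil(fileSize / blockSize) splits.
    static long numSplits(long fileSize, long blockSize, boolean splitable) {
        if (!splitable) {
            return 1; // whole file -> one map task
        }
        return (fileSize + blockSize - 1) / blockSize; // ceiling division
    }

    public static void main(String[] args) {
        long oneGB = 1024L * 1024 * 1024;
        long block = 64L * 1024 * 1024; // default 64MB block size in 0.20
        System.out.println(numSplits(oneGB, block, true));  // 16 maps per file
        System.out.println(numSplits(oneGB, block, false)); // 1 map per file
        // So 100 non-splittable 1GB files -> 100 map tasks, one per file.
    }
}
```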
Hi my friends:
I have researched Hadoop and map/reduce, but before I can go on,
I have one question, and I can't find it in the FAQ. Please consider this situation:
1. I created 100 files; each file is of course bigger than the default
64MB (such as 1GB), so it will definitely be spli
Please use the hbase mailing list for HBase-related questions.
Regarding your issue, we'll need more information to help you out.
Have you checked the logs? If you see exceptions in there, did you
google them trying to figure out what's going on?
Finally, does your setup meet all the requirement
The summary is quite inaccurate.
On Mon, Feb 14, 2011 at 8:48 AM, Mark Kerzner wrote:
> Hi,
>
> is it accurate to say that
>
> - In 0.20 the Secondary NameNode acts as a cold spare; it can be used to
> recreate the HDFS if the Primary NameNode fails, but with the delay of
> minutes if not
Hi,
We are new to Hadoop; we have just configured a cluster with 3 servers, and
everything is working OK except that when one server goes down, Hadoop / HDFS
continues working but HBase stops: queries do not return results
until we restart HBase. The HBase configuration is copied bel
Did you get a response/solution/workaround to this problem?
I am getting the same error.
-Jairam
Hi,
is it accurate to say that
- In 0.20 the Secondary NameNode acts as a cold spare; it can be used to
recreate the HDFS if the Primary NameNode fails, but with the delay of
minutes if not hours, and there is also some data loss;
- in 0.21 there are streaming edits to a Backup Node (
HDFS definitely doesn't follow anything like POSIX file semantics.
They may be a vague inspiration for what HDFS does, but generally the
behavior of HDFS is not tightly specified. Even the unit tests have some
real surprising behavior.
On Mon, Feb 14, 2011 at 7:21 AM, Gokulakannan M wrote:
>
>
>> I think that in general, the behavior of any program reading data from an
HDFS file before hsync or close is called is pretty much undefined.
In Unix, users can read a file in parallel while another user is writing to
it. And I suppose the sync feature design is based on that.
So at any p
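The read-while-write behavior being described can be demonstrated on a local
POSIX filesystem with plain Java (this is a local-disk illustration only, not
an HDFS example; HDFS only approximates it via hflush/hsync):

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadWhileWrite {
    // Write to a temp file and read it back *while the writer is still
    // open*: on a local POSIX filesystem the flushed bytes are immediately
    // visible to concurrent readers.
    static String readWhileWriting(String data) throws IOException {
        Path p = Files.createTempFile("rw", ".txt");
        try (Writer w = new FileWriter(p.toFile())) {
            w.write(data);
            w.flush(); // make bytes visible to other readers
            return new String(Files.readAllBytes(p)); // writer still open here
        } finally {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readWhileWriting("hello")); // prints "hello"
    }
}
```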
On Mon, Feb 14, 2011 at 11:55 AM, Matthew John
wrote:
> Hi guys,
>
> can someone send me a good documentation on Hbase (other than the
> hadoop wiki). I am also looking for a good Hbase tutorial.
>
Have you checked this: http://hbase.apache.org/book.html ?
-b
> Regards,
> Matthew
>
Steve is right, and to try to add more clarification...
> Interesting choice; the 7 core in a single CPU option is something else
> to consider. Remember also this is a moving target, what anyone says is
> valid now (Feb 2011) will be seen as quaint in two years time. Even a
> few months from
On 12/02/11 16:26, Michael Segel wrote:
All,
I'd like to clarify some things...
First the concept is to build out a cluster of commodity hardware.
So when you do your shopping you want to get the most bang for your buck. That
is the 'sweet spot' that I'm talking about.
When you look at your E5
Hi guys,
can someone send me a good documentation on Hbase (other than the
hadoop wiki). I am also looking for a good Hbase tutorial.
Regards,
Matthew
On 10/02/11 22:25, Michael Segel wrote:
Shrinivas,
Assuming you're in the US, I'd recommend the following:
Go with 2TB 7200 SATA hard drives.
(Not sure what type of hardware you have)
What we've found is that in the data nodes, there's an optimal configuration
that balances price versus per