Hello Jim,
Row locks do not apply to reads, only updates. They prevent two applications
from updating the same row simultaneously. There is no other locking mechanism
in HBase. (It follows Bigtable in this regard. See
http://labs.google.com/papers/bigtable.html )
Thank you for
Owen O'Malley wrote:
On Apr 16, 2008, at 8:28 AM, Chaman Singh Verma wrote:
I am developing an application with MapReduce, and whenever some MapTask
condition is met, I would like to broadcast to all other MapTasks to abort
their work. I am not quite sure whether such broadcasting
Hi Cagdas,
simply adjust your log4j.properties (needs to be on the CLASSPATH of your
DFSClient app):
log4j.logger.org.apache.hadoop=DEBUG
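For reference, a minimal sketch of the relevant log4j.properties fragment. The narrower DFSClient logger name below is an assumption based on the org.apache.hadoop.dfs package layout of this era:

```properties
# Put this file on the client's CLASSPATH.
# Verbose logging for all of Hadoop:
log4j.logger.org.apache.hadoop=DEBUG
# Or, narrower (assumed logger name), just the DFS client:
log4j.logger.org.apache.hadoop.dfs.DFSClient=DEBUG
```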
Cu on the 'net,
Bye - bye,
André èrbnA
Cagdas Gerede wrote:
How do you set DFSClient's log to
Good Day,
I successfully installed Hadoop and copied a test file to HDFS. I was
wondering if it is possible to directly access the file without first
getting it out of HDFS.
Regards,
Garri
What do you mean by 'directly access the file'? HDFS provides several
file operations. Type '${PATH_TO_HADOOP_INSTALL}/bin/hadoop fs' to see
an appropriate usage message.
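As an illustration, a few of the listed operations (the paths below are made up, and the commands of course need a running HDFS):

```shell
# List, view, and fetch a file that already lives in HDFS
${PATH_TO_HADOOP_INSTALL}/bin/hadoop fs -ls /user/garri
${PATH_TO_HADOOP_INSTALL}/bin/hadoop fs -cat /user/garri/test.txt
${PATH_TO_HADOOP_INSTALL}/bin/hadoop fs -copyToLocal /user/garri/test.txt /tmp/test.txt
```

The -cat form is the closest thing to "directly accessing" the file: it streams the content without materializing a local copy first.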
Regards,
Thomas
Garri Santos schrieb:
Good Day,
I successfully installed Hadoop and copied a test file to HDFS. I was wondering
One more thing:
Will the HashMap that I am generating in the reduce phase be on a single node
or on multiple nodes in the distributed environment? If my dataset is large,
will this approach work? If not, what can I do about it?
Also, the same goes for the file that I am writing in the run function (simple
Are the videos and slides available now?
- Original Message -
From: Jeremy Zawodny [EMAIL PROTECTED]
To: core-user@hadoop.apache.org
Cc: [EMAIL PROTECTED]
Sent: Thursday, March 27, 2008 11:01 AM
Subject: Re: Hadoop summit video capture?
Slides and video go up next week. It just
Hi,
Will Hadoop ever interleave multiple maps/reduces from different jobs
on the same tasktracker?
Suppose I have 2 jobs submitted to a jobtracker, one after the other.
Must all maps/reduces from the first submitted job be completed before
the tasktrackers will run any of the maps/reduces from
My latest problem is:
I cannot always rely on writing the HashMap to a file like this:
FileOutputStream fout = new FileOutputStream(f);
ObjectOutputStream objStream = new ObjectOutputStream(fout);
objStream.writeObject(myMap); // the map instance, not the HashMap class
objStream.close();
This writing I am doing in the same run() of the outer class. The
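For what it's worth, a self-contained sketch of that Object-stream round trip (the class name, file name, and map contents are made up). Note that writeObject takes the map instance, and the stream must be closed so its buffer is flushed:

```java
import java.io.*;
import java.util.HashMap;
import java.util.Map;

public class HashMapSerDemo {
    // Serialize the map to a file, then read it back to verify the round trip.
    static Map<String, Integer> roundTrip(Map<String, Integer> counts, File f) throws Exception {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(f))) {
            out.writeObject(counts);  // the instance, not HashMap.class
        }
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(f))) {
            @SuppressWarnings("unchecked")
            Map<String, Integer> restored = (Map<String, Integer>) in.readObject();
            return restored;
        }
    }

    public static void main(String[] args) throws Exception {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("hadoop", 3);
        File f = File.createTempFile("counts", ".ser");
        System.out.println(roundTrip(counts, f)); // prints {hadoop=3}
        f.delete();
    }
}
```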
Hi Devraj,
So, I researched the topic with the counters further, with some success.
For one, I can reproduce it now with a test.
I am waiting for the password for my JIRA account to get started there -
somehow I didn't get the password after registration; I sent a mail to
Owen.
I am not familiar
On Thu, Apr 17, 2008 at 2:41 AM, Nate Carlson [EMAIL PROTECTED] wrote:
I'm setting up a hadoop cluster across two data centers (with gig bandwidth
between them).. I'd like to use the rack awareness features to help Hadoop
know which nodes are local.. I see that it's possible, but haven't found
I am not quite sure what you mean by this.
If you mean that the second approach is only an approximation, then you are
correct.
The only simple correct algorithm that I know of is to do the counts
(correctly) and then do the main show (processing with a kill list).
On 4/16/08 9:04 PM, Amar
Yes -- you need to set your DynDNS record to point at the host (IP address)
printed after the text "Master is". That's the master node of your EC2
cluster. The EC2UI Firefox plugin can be useful here to verify
(independently) that your EC2 instances have started correctly.
Norbert
On Wed, Apr 16,
You can also get to the file via HTTP.
On 4/17/08 2:43 AM, Thomas Thevis [EMAIL PROTECTED] wrote:
What do you mean by 'directly access the file'? HDFS provides several
file operations. Type '${PATH_TO_HADOOP_INSTALL}/bin/hadoop fs' to see
an appropriate usage message.
Regards,
Thomas
Don't assume that any variables are shared between reducers or between maps,
or between maps and reducers.
If you want to share data, put it into HDFS.
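As a sketch of the suggested pattern using the Hadoop FileSystem API (the class name, path, and payload below are made up; running it requires the Hadoop jars and a configured cluster):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShareViaHdfs {
    public static void main(String[] args) throws Exception {
        // Write shared data to HDFS so any map or reduce task can read it back,
        // instead of relying on an in-memory variable that lives on one node only.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path shared = new Path("/user/shared/top-keys.txt"); // made-up path
        FSDataOutputStream out = fs.create(shared);
        out.writeBytes("hadoop\t3\n");
        out.close();
    }
}
```

Tasks on other nodes can then open the same path with fs.open(shared) and read the data back.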
On 4/17/08 4:01 AM, Aayush Garg [EMAIL PROTECTED] wrote:
One more thing:::
The HashMap that I am generating in the reduce phase will be on
You need to create a DynDNS account and then add host records to this
account.
On Thu, Apr 17, 2008 at 12:03 PM, Prerna Manaktala
[EMAIL PROTECTED] wrote:
Hi
How do we authenticate on the dyndns site?
I tried the dyndns query tool but failed when I entered
prerna.dyndns.org and class as key
Thanks! Will take a look at the jira issue 3267
_
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Thursday, April 17, 2008 7:09 PM
To: core-user@hadoop.apache.org
Subject: RE: Counters giving double values
hi devraj,
so, i researched the topic with the counters further
Already done that, but to do ssh to that particular host shouldn't some
key be generated on dyndns?
Since I am not able to do ssh to that host
On Thu, Apr 17, 2008 at 12:17 PM, Norbert Burger
[EMAIL PROTECTED] wrote:
You need to create a DynDNS account and then add host records to this
account.
The 'hadoop-ec2 run' script will ssh you automatically into your master node
after you complete the launch process.
If you're manually ssh'ing into the host, you need to specify the private
key generated during the EC2 signup process. For command-line ssh (inside
Cygwin), use the -i argument
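For example (the key path and hostname below are hypothetical; substitute the keypair file created during EC2 signup and your master node's address):

```shell
# -i points ssh at the EC2 private key instead of the default identity
ssh -i ~/.ssh/id_rsa-my-ec2-keypair root@ec2-12-34-56-78.compute-1.amazonaws.com
```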
The DFSClient caches small packets (e.g. 64K write buffers) and they are
lazily flushed to the datanodes in the pipeline. So, when an application
completes an out.write() call, it is definitely not guaranteed that the data
has been sent to even one datanode.
One option would be to retrieve cache hints
I tried it:
ssh -i <path to id_rsa> 128.205.234.20 (the path is correct)
but again the same result of:
ssh: connect to host 128.205.234.20 port 22: Connection refused
I tried disabling firewall as well.
On Thu, Apr 17, 2008 at 12:56 PM, Norbert Burger
[EMAIL PROTECTED] wrote:
The 'hadoop-ec2 run'
Current structure of my program is:
Upper class {
    class Reduce {
        reduce function(K1, V1, K2, V2) {
            // I count the frequency for each key
            // Add output in HashMap(Key, value) instead of output.collect()
        }
    }
    void run()
    {
        runjob();
        // Now eliminate top frequency keys in
Ted Dunning wrote:
I am not quite sure what you mean by this.
If you mean that the second approach is only an approximation, then you are
correct.
Yes.
The only simple correct algorithm that I know of is to do the counts
(correctly) and then do the main show (processing with a kill
If I try to specify the ID and Secret as part of the S3 URL, I get the
following error:
[EMAIL PROTECTED]:~# hadoop distcp /dijkstra.log
s3://1W27ZBE2AKDVVFZB9T02:[EMAIL PROTECTED]/
With failures, global counters are inaccurate; consider running with -i
Copy failed:
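An alternative to embedding the ID and secret in the S3 URL is to put them in hadoop-site.xml via the standard S3 filesystem properties (placeholder values below); the distcp target then only needs the bucket name, e.g. s3://mybucket/:

```xml
<!-- hadoop-site.xml: supply S3 credentials via properties instead of the URL -->
<property>
  <name>fs.s3.awsAccessKeyId</name>
  <value>YOUR_AWS_ACCESS_KEY_ID</value>
</property>
<property>
  <name>fs.s3.awsSecretAccessKey</name>
  <value>YOUR_AWS_SECRET_ACCESS_KEY</value>
</property>
```

This also sidesteps URL-escaping problems when the secret key contains characters such as "/".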
Thanks.
Already installed, but ssh to the newly created host on the dyndns site is
still not working:
ssh -i <path to id_rsa> prerna.dyndns.org
It says connection to port 22 refused.
Prerna
On Wed, Apr 16, 2008 at 9:17 PM, Edward J. Yoon [EMAIL PROTECTED] wrote:
I didn't try to run on cygwin, but you
[EMAIL PROTECTED] ~
$ ssh
usage: ssh [-1246AaCfgKkMNnqsTtVvXxY] [-b bind_address] [-c cipher_spec]
[-D [bind_address:]port] [-e escape_char] [-F configfile]
[-i identity_file] [-L [bind_address:]port:host:hostport]
[-l login_name] [-m mac_spec] [-O ctl_cmd] [-o
how to do that?
On Thu, Apr 17, 2008 at 7:35 PM, Edward J. Yoon [EMAIL PROTECTED] wrote:
Your ISP may be blocking access to critical ports behind their
routers. Can you access any ports on your router? You may want to try
setting up your router to forward some other port (80?) to your
It wasn't too difficult, though I do not remember the details.
Maybe searching the WWW will reveal them:
http://www.google.com/search?q=dyndns+and+ssh
-Edward
On Fri, Apr 18, 2008 at 8:43 AM, Prerna Manaktala
[EMAIL PROTECTED] wrote:
how to do that?
On Thu, Apr 17, 2008 at 7:35 PM,
It took me three days and three times as many short attempts to get the
test sources to compile. It was all about sitting down and reading the
Ant build to figure out that RecInt, RecString, etc. have to be
built using the target generate-test-records, and then adding the output in
build/ to my
Hey,
thanks, but I am still not able to figure it out.
Please help.
On Thu, Apr 17, 2008 at 8:00 PM, Edward J. Yoon [EMAIL PROTECTED] wrote:
It wasn't too difficult though I do not remember the details.
Maybe searching the WWW does reveal the details
http://www.google.com/search?q=dyndns+and+ssh
Hadoop has enormous startup costs that are relatively inherent in the
current design.
Most notably, mappers and reducers are executed in a standalone JVM
(ostensibly for safety reasons).
On 4/17/08 6:00 PM, Karl Wettin [EMAIL PROTECTED] wrote:
Is it possible to execute a job more than once?
Ted Dunning skrev:
Hadoop has enormous startup costs that are relatively inherent in the
current design.
Most notably, mappers and reducers are executed in a standalone JVM
(ostensibly for safety reasons).
Is it possible to hack in support to reuse JVMs? Keep it alive until
timed out and
Hi --
Not really sure that JVM startup is the main overhead -- you could take a
look at the logfiles of the individual TIPs and compare the timestamp of the
first log message to the time the jobtracker reports that TIP was started.
In my experience, that is well under a second (once the cluster