Yes, version 18.3 is the most stable one. It has added patches,
without unproven new functionality.
2009/2/11 Owen O'Malley omal...@apache.org:
On Feb 10, 2009, at 7:21 PM, Vadim Zaliva wrote:
Maybe version 0.18
is better suited for a production environment?
Yahoo is mostly on 0.18.3 +
You can retrieve them from the command line using
bin/hadoop job -counter job-id group-name counter-name
Tom
On Wed, Feb 11, 2009 at 12:20 AM, scruffy323 steve.mo...@gmail.com wrote:
Do you know how to access those counters programmatically after the job has
run?
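For a programmatic route, a minimal sketch assuming the 0.18-era
org.apache.hadoop.mapred API; the job ID and counter names here are
placeholders:

    import org.apache.hadoop.mapred.Counters;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.RunningJob;

    // Fetch the counters of a completed job through JobClient.
    JobClient client = new JobClient(new JobConf());
    RunningJob job = client.getJob("job_200902110000_0001"); // placeholder ID
    Counters counters = job.getCounters();
    long value = counters.getGroup("my-group").getCounter("my-counter");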
S D-5 wrote:
This does
Zheng Shao wrote:
We need to implement a version of Integer.parseInt/atoi from byte[] instead of
String to avoid the high cost of creating a String object.
I wanted to take the OpenJDK code, but the license is GPL:
http://www.docjar.com/html/api/java/lang/Integer.java.html
Does anybody know
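For what it's worth, a clean-room sketch of such a routine (not derived from
the OpenJDK code); it assumes ASCII digits, base 10, and deliberately skips
full overflow handling:

    // Parse a decimal int directly from a byte[] slice, avoiding the
    // intermediate String object. Overflow checking is minimal in this sketch.
    public static int parseInt(byte[] bytes, int start, int length) {
        if (length <= 0) throw new NumberFormatException("empty input");
        int i = start, end = start + length;
        boolean negative = bytes[i] == '-';
        if (negative || bytes[i] == '+') i++;
        if (i == end) throw new NumberFormatException("no digits");
        int result = 0;
        for (; i < end; i++) {
            int digit = bytes[i] - '0';
            if (digit < 0 || digit > 9)
                throw new NumberFormatException("bad digit at offset " + i);
            result = result * 10 + digit; // note: no overflow check here
        }
        return negative ? -result : result;
    }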
Brian Bockelman wrote:
Just to toss out some numbers (and because our users are making
interesting numbers right now)
Here's our external network router:
http://mrtg.unl.edu/~cricket/?target=%2Frouter-interfaces%2Fborder2%2Ftengigabitethernet2_2;view=Octets
Here's the
Good morning everyone,
I have a question about the correct setup for Hadoop. I have 14 Dell
computers in a lab. Each is connected to the internet and
independent of the others. All run CentOS. Logins are handled by NIS.
If userA logs into the master and starts the daemons and userB logs
The particular problem I am having is this one:
https://issues.apache.org/jira/browse/HADOOP-2669
I am observing it in version 19. Could anybody confirm that
it has been fixed in 18, as Jira claims?
I am wondering why the bug fix for this problem might have been committed
to the 18 branch but not 19.
Hi,
Let's say the smaller subset has the name A. It is a relatively small collection
of 100,000 entries (could also be only 100), with nearly no payload as the value.
Collection B is a big collection with 10,000,000 entries (each key of A
also exists in collection B), where the value for each key is
Vadim Zaliva wrote:
The particular problem I am having is this one:
https://issues.apache.org/jira/browse/HADOOP-2669
I am observing it in version 19. Could anybody confirm that
it has been fixed in 18, as Jira claims?
I am wondering why the bug fix for this problem might have been committed
to
Are the keys in collection B unique?
If so, I would like to try this approach:
For each key/value pair of collection B, make a file out of it, with the file name
given by the MD5 hash of the key and the value as its content, and then
store all these files in a HAR archive.
The HAR archive will create an
Hey all
I was trying to edit a file mounted by fuse_dfs with the vi editor, but the
contents could not be saved.
The command is like the following:
[had...@vm-centos-5-shu-4 src]$ vi /mnt/dfs/test.txt
The error message from system log (/var/log/messages) is the following:
Feb 12 09:53:48
I don't see why a HAR archive needs to be involved. You can use a MapFile to
create a scannable index over a SequenceFile and do lookups that way.
But if A is small enough to fit in RAM, then there is a much simpler way:
Write it out to a file and disseminate it to all mappers via the
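If that truncated suggestion refers to the DistributedCache (an assumption on
my part), the pattern would look roughly like this with the 0.18-era API;
paths and class names are placeholders:

    import java.io.IOException;
    import java.net.URI;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.JobConf;

    // Driver side: ship the small collection A to every task node.
    JobConf conf = new JobConf(MyJob.class); // MyJob is a placeholder
    DistributedCache.addCacheFile(new URI("/data/collection-a.txt"), conf);

    // Mapper side: load A into memory once, then probe it for each record of B.
    public void configure(JobConf job) {
        try {
            Path[] cached = DistributedCache.getLocalCacheFiles(job);
            // read cached[0] into a HashMap<String, String> here and use it
            // for in-memory lookups in map()
        } catch (IOException e) {
            throw new RuntimeException("could not load cached file", e);
        }
    }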
Hi all,
I am running a data-intensive job on 18 nodes on EC2, each with just
1.7GB of memory. The input size is 50GB, and as a result, my mapper splits
it up automatically into 786 map tasks. This runs fine. However, I am
setting the reduce task number to 18. This is where I get a Java heap
Maybe you need to allocate a larger JVM heap using the parameter -Xmx1024m
(see the config sketch after the quoted message below).
On Thu, Feb 12, 2009 at 10:56 AM, Kris Jirapinyo kjirapi...@biz360.comwrote:
Hi all,
I am running a data-intensive job on 18 nodes on EC2, each with just
1.7GB of memory. The input size is 50GB, and as a result, my
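For reference, the per-task heap in this era is normally raised via
mapred.child.java.opts in hadoop-site.xml; a sketch, where the 768m value is
only illustrative:

    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx768m</value>
    </property>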
Darn that send button.
Anyway, I was wondering if my understanding is correct. There will only
be exactly as many output files as the number of reduce tasks I
set. Thus, in my output directory from the reducer, I should always see
only 18 files. However, if my understanding is
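For what it's worth, the reduce-task count is set on the JobConf, and each
reduce task writes one part file; a minimal sketch assuming the 0.18-era API
(the class name is a placeholder):

    // Each reduce task writes exactly one output file, so 18 reduce tasks
    // yield part-00000 through part-00017 in the job's output directory.
    JobConf conf = new JobConf(MyJob.class); // MyJob is a placeholder
    conf.setNumReduceTasks(18);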
I tried that, but with 1.7GB that will not allow me to run 1 mapper and 1
reducer concurrently (as I think -Xmx1024m tries to reserve that much
physical memory?). Thus, to be safe, I set it to -Xmx768m.
The error I get when I do 1024m is this:
java.io.IOException: Cannot run program
bjday wrote:
Good morning everyone,
I have a question about the correct setup for Hadoop. I have 14 Dell
computers in a lab. Each is connected to the internet and
independent of the others. All run CentOS. Logins are handled by
NIS. If userA logs into the master and starts the daemons
Like Amar said, try adding
<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>
to your conf/hadoop-site.xml file (or flip the value in hadoop-default.xml),
restart your daemons and give it a whirl.
cheers,
-jw
On Wed, Feb 11, 2009 at 8:44 PM, Amar Kamat ama...@yahoo-inc.com wrote:
I also have the same problem.
It would be wonderful if someone had some info about this.
Rasit
2009/2/10 Mimi Sun m...@rapleaf.com:
I see UnsatisfiedLinkError. Also, I'm calling
System.getProperty("java.library.path") in the reducer and logging it. The
only thing that prints out is
On Feb 10, 2009, at 12:24 PM, Mimi Sun wrote:
I see UnsatisfiedLinkError. Also, I'm calling
System.getProperty("java.library.path") in the reducer and logging
it. The only thing that prints out is ...hadoop-0.18.2/bin/../lib/
native/Mac_OS_X-i386-32
I'm using Cascading, not sure if that
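For the record, the logging being described is just a one-liner in the old
MapReduce API; a trivial sketch:

    // Print the JVM's native library search path from inside a reducer,
    // to see which directories an UnsatisfiedLinkError is searching.
    public void configure(JobConf job) {
        System.err.println("java.library.path = "
                + System.getProperty("java.library.path"));
    }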
Hi, Mark
Try adding an extra property to that file, and then check whether
Hadoop recognizes it.
That way you can find out whether Hadoop is actually reading your configuration file.
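A hedged illustration of that trick; the property name here is made up:

    <property>
      <name>my.test.marker</name>
      <value>it-works</value>
    </property>

and then, from code that loads the configuration:

    // If this prints null, Hadoop is not reading the edited file.
    System.out.println(new JobConf().get("my.test.marker"));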
2009/2/10 Jeff Hammerbacher ham...@cloudera.com:
Hey Mark,
In NameNode.java, the DEFAULT_PORT specified for NameNode RPC is 8020.