Hi all,
I have a program that needs to use two reduce functions. Can anyone tell me how to do this?
Thank you!
Qiang
Check the namenode log. It is possible that your NFS mount has problems
and the NameNode might be stuck trying to write to it.
If the log is not useful, you can attach jstack output for the NameNode
when it seems to be stuck.
Raghu.
Nathan Wang wrote:
Hi,
We're having problems when trying to deal with the namenode failover, by
following the wiki
http://wiki.apache.org/hadoop/NameNodeFailover
If we point dfs.name.dir to 2 local directories, it works fine.
But, if one of the directories is NFS mounted, we're having these problems:
1) "hadoo
On 2/22/08 3:58 PM, "Jason Venner" <[EMAIL PROTECTED]> wrote:
> We have been unable to get torque up and running. The magic value in the
> server_name file seems to elude us.
The server_name should be the real hostname of the machine running
pbs_server.
> We have tried localhost, 127.0.0.1, m
There is a package for joining data from multiple sources:
contrib/data-join.
It implements the basic joining logic and lets the user provide
application-specific logic for filtering/projecting and combining
multiple records into one.
Runping
I agree. I'd love to be part of this, but the rooms are full.
Xavier
-Original Message-
From: Stefan Groschupf [mailto:[EMAIL PROTECTED]
Sent: Friday, February 22, 2008 11:04 AM
To: core-user@hadoop.apache.org
Subject: Re: Hadoop summit / workshop at Yahoo!
Puhh, 2 days and it is full?
I read the docs about rack awareness, but my issue is how the client can
pick specific datanodes, located in a specific rack, to write the block
to. The idea is that the client should be able to write the block to two
separate groups of datanodes in the same hdfs. For
instance: bin/ha
We have been unable to get torque up and running. The magic value in the
server_name file seems to elude us.
We have tried localhost, 127.0.0.1, the machine name, the machine IP, and
the fully-qualified machine name. Depending on what we use, we either get
Unauthorized request or invalid entry
qmgr obj= svr=default: Bad ACL
Joins are easy.
Just reduce on a key composed of the stuff you want to join on. If the data
you are joining is disparate, leave some kind of hint about what kind of
record you have.
The reducer will be iterating through sets of records that have the same
key. This is similar to the results of
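To make that concrete, here is a rough sketch of a reduce-side join in the
old mapred API. The class names and the "orders" path fragment are made up
for illustration, and the code is untested; the point is just the pattern:
tag each record with its source in the map, then pair the two sides up in
the reduce.

  import java.io.IOException;
  import java.util.ArrayList;
  import java.util.Iterator;
  import java.util.List;

  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapred.MapReduceBase;
  import org.apache.hadoop.mapred.Mapper;
  import org.apache.hadoop.mapred.OutputCollector;
  import org.apache.hadoop.mapred.Reducer;
  import org.apache.hadoop.mapred.Reporter;

  // Map side: key each record by the join field and prefix the value
  // with a one-letter hint ("A" or "B") saying which source it is from.
  public class TaggingMapper extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, Text> {
    private String tag;

    public void configure(JobConf job) {
      // Derive the hint from the input file name; "orders" is a
      // hypothetical path fragment, adjust for your own inputs.
      String file = job.get("map.input.file", "");
      tag = file.contains("orders") ? "A" : "B";
    }

    public void map(LongWritable offset, Text line,
                    OutputCollector<Text, Text> out, Reporter reporter)
        throws IOException {
      String[] fields = line.toString().split("\t", 2);
      if (fields.length < 2) return;
      out.collect(new Text(fields[0]), new Text(tag + "\t" + fields[1]));
    }
  }

  // Reduce side: all records sharing a join key arrive in one call;
  // buffer one side and cross it with the other.
  class JoinReducer extends MapReduceBase
      implements Reducer<Text, Text, Text, Text> {
    public void reduce(Text key, Iterator<Text> values,
                       OutputCollector<Text, Text> out, Reporter reporter)
        throws IOException {
      List<String> left = new ArrayList<String>();
      List<String> right = new ArrayList<String>();
      while (values.hasNext()) {
        String v = values.next().toString();
        if (v.startsWith("A\t")) left.add(v.substring(2));
        else right.add(v.substring(2));
      }
      for (String l : left)
        for (String r : right)
          out.collect(key, new Text(l + "\t" + r));
    }
  }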
If your file system metadata is in /tmp, then you are likely to see
these kinds of problems. It would be nice if you could move your
metadata files out of /tmp. If you still see the problem, can you
please send us the logs from the log directory?
Thanks a bunch,
Dhruba
Raghu Angadi wrote:
Please report such problems if you think it was because of HDFS, as
opposed to some hardware or disk failures.
Will do. I suspect it's something else. I'm testing on a notebook in
pseudo-distributed
mode (per the quick start guide). My IP changes when I take that box be
Please report such problems if you think it was because of HDFS, as
opposed to some hardware or disk failures.
Raghu.
Steve Sapovits wrote:
dhruba Borthakur wrote:
Reformatting should never be necessary if you are using a released version
of Hadoop. HADOOP-2783 refers to a bug that got introduced into trunk
(not in any released versions).
Interesting. We're running only released versions. We have cases where the
name node won't co
Reformatting should never be necessary if you are using a released version
of Hadoop. HADOOP-2783 refers to a bug that got introduced into trunk
(not in any released versions).
Thanks,
Dhruba
-Original Message-
From: Steve Sapovits [mailto:[EMAIL PROTECTED]
Sent: Friday, February 22, 2008
What are the situations that make reformatting necessary? Testing, we seem
to hit a lot of cases where we have to reformat. We're wondering how much of
a real production issue this is.
--
Steve Sapovits
Invite Media - http://www.invitemedia.com
[EMAIL PROTECTED]
André,
You can try to roll back.
You did use the upgrade option when you switched to the new trunk, right?
--Konstantin
Raghu Angadi wrote:
André Martin wrote:
Hi Raghu,
done: https://issues.apache.org/jira/browse/HADOOP-2873
Subsequent tries did not succeed - so it looks like I need to
re-format the clu
Yes, I was really looking forward to attending. :)
On Fri, Feb 22, 2008 at 11:04 AM, Stefan Groschupf <[EMAIL PROTECTED]> wrote:
> Puhh, 2 days and it is full?
> Does Yahoo have no rooms bigger than one for 100 people?
> On Feb 20, 2008, at 12:10 PM, Ajay Anand wrote:
> > The reg
Puhh, 2 days and it is full?
Does Yahoo have no rooms bigger than one for 100 people?
On Feb 20, 2008, at 12:10 PM, Ajay Anand wrote:
The registration page for the Hadoop summit is now up:
http://developer.yahoo.com/hadoop/summit/
Space is limited, so please sign up early if you are inter
You could probably treat these two groups as different "racks". You can
read about rack awareness in
http://hadoop.apache.org/core/docs/r0.16.0/hdfs_user_guide.html and
follow the links from there for more information regarding how to
configure it, etc.
Raghu.
[EMAIL PROTECTED] wrote:
Hi There,
I
Tarandeep Singh wrote:
but isn't the output of reduce step sorted ?
No, the input of reduce is sorted by key. The output of reduce is
generally produced as the input arrives, so it is generally also sorted
by key, but reducers can output whatever they like.
Doug
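A made-up example of that last point: the reducer below receives its keys
in sorted order, but because it re-keys its output by a count, the job's
output is no longer sorted by the original key. Old mapred API, untested.

  import java.io.IOException;
  import java.util.Iterator;

  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.MapReduceBase;
  import org.apache.hadoop.mapred.OutputCollector;
  import org.apache.hadoop.mapred.Reducer;
  import org.apache.hadoop.mapred.Reporter;

  // Input keys arrive sorted; the output below is keyed by the number
  // of values instead, so it need not be sorted by the original key.
  public class CountKeyedReducer extends MapReduceBase
      implements Reducer<Text, Text, IntWritable, Text> {
    public void reduce(Text key, Iterator<Text> values,
                       OutputCollector<IntWritable, Text> out,
                       Reporter reporter) throws IOException {
      int n = 0;
      while (values.hasNext()) { values.next(); n++; }
      out.collect(new IntWritable(n), key);
    }
  }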
Have you seen PIG:
http://incubator.apache.org/pig/
It generates Hadoop code and is more query-like, and (as far as I
remember) includes union, join, etc.
Tim
On Fri, 2008-02-22 at 09:13 -0800, Chuck Lan wrote:
> Hi,
>
> I'm currently looking into how to better scale the performance of our
> ca
See http://incubator.apache.org/pig/. Hope that helps. Not sure how joins
could be done in Hadoop.
Amar
On Fri, 22 Feb 2008, Chuck Lan wrote:
Hi,
I'm currently looking into how to better scale the performance of our
calculations involving large sets of financial data. It is currently using
a
I added this to the wiki.
Doug
Jimmy Lin wrote:
University of Maryland
http://www.umiacs.umd.edu/~jimmylin/cloud-computing/index.html
We are one of six universities participating in IBM/Google's academic
cloud computing initiative. Ongoing research and teaching efforts
include projects in ma
University of Maryland
http://www.umiacs.umd.edu/~jimmylin/cloud-computing/index.html
We are one of six universities participating in IBM/Google's academic
cloud computing initiative. Ongoing research and teaching efforts
include projects in machine translation, language modeling,
bioinformatic
On Feb 21, 2008, at 3:29 AM, Raghavendra K wrote:
Hi,
I am able to get Hadoop running and am also able to compile
libhdfs.
But when I run the hdfs_test program it is giving Segmentation Fault.
Unfortunately the documentation for using libhdfs is sparse, our
apologies.
You'll need to
On 2/21/08 10:52 AM, "Luca" <[EMAIL PROTECTED]> wrote:
> A few questions:
> - is Java6 ok for HOD?
That's what we use.
> - I have an externally running HDFS cluster, as specified in
> [gridservice-hdfs]: how do I find out the fs_port of my cluster? Is it
> something specified in the hadoop-si
Guys:
Thanks for the information... I've gotten some pretty good results twiddling
some parameters. I've also reminded myself about the pitfalls of
oversubscribing resources (like number of reducers). Here's what I learned,
written up here to hopefully help somebody later...
I set u
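The writeup above is cut off, but for anyone looking for the reducer-count
knob it mentions: that is set per job. A minimal sketch, assuming the old
JobConf API (the numbers are placeholders, not recommendations):

  import org.apache.hadoop.mapred.JobConf;

  public class TuningSketch {
    public static void main(String[] args) {
      JobConf job = new JobConf(TuningSketch.class);
      // Keep the reduce count at or below the cluster's reduce slots
      // to avoid oversubscribing; "9" is only a placeholder.
      job.setNumReduceTasks(9);
      // The map count is only a hint to the framework.
      job.setNumMapTasks(100);
    }
  }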
From the jira (for users similarly affected):
Andre,
as a temporary hack, you can just comment out FSImage.java:749, and your
restart should work, since these are the last entries read from the
image file.
Raghu.
Raghu Angadi wrote:
André Martin wrote:
Hi Raghu,
done: https://is
Hi,
I'm currently looking into how to better scale the performance of our
calculations involving large sets of financial data. It is currently using
a series of Oracle SQL statements to perform the calculations. It seems to
me that the MapReduce algorithm may work in this scenario. However, I
b
Sorry, I'm an idiot. Following the law that says one figures it out
immediately on pestering others -- globStatus will do it.
Thanks,
Josh
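For anyone who finds this thread later, a minimal sketch of the globStatus
replacement (the path pattern here is made up):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileStatus;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class GlobExample {
    public static void main(String[] args) throws Exception {
      FileSystem fs = FileSystem.get(new Configuration());
      // globStatus replaces the deprecated globPaths and returns
      // FileStatus[]; pull the Path back out of each match.
      FileStatus[] matches = fs.globStatus(new Path("/logs/2008-02-*"));
      for (FileStatus status : matches) {
        System.out.println(status.getPath());
      }
    }
  }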
On 2/22/08, Josh Snyder <[EMAIL PROTECTED]> wrote:
> Hi,
>
> In the current API documentation, FileSystem.globPaths is marked as
> deprecated. However, I c
Hi,
In the current API documentation, FileSystem.globPaths is marked as
deprecated. However, I couldn't figure out what I could use in its
place. What is the preferred alternative to globPaths?
I'm new to this list and to Hadoop, so I apologize if this is obvious
-- but grepping + skimming didn't
André Martin wrote:
Hi Raghu,
done: https://issues.apache.org/jira/browse/HADOOP-2873
Subsequent tries did not succeed - so it looks like I need to re-format
the cluster :-(
Please back up the log files and the namenode image files if you can
before you re-format.
Raghu.
On Fri, Feb 22, 2008 at 5:46 AM, Owen O'Malley <[EMAIL PROTECTED]> wrote:
>
> On Feb 21, 2008, at 11:01 PM, Ted Dunning wrote:
>
> >
> > But this only guarantees that the results will be sorted within each
> > reducer's input. Thus, this won't result in getting the results
> > sorted by
> > t
Hi Raghu,
done: https://issues.apache.org/jira/browse/HADOOP-2873
Subsequent tries did not succeed - so it looks like I need to re-format
the cluster :-(
Cu on the 'net,
Bye - bye,
< André èrbnA >
Raghu Angadi wrote:
P
On Feb 21, 2008, at 11:01 PM, Ted Dunning wrote:
But this only guarantees that the results will be sorted within each
reducer's input. Thus, this won't result in getting the results sorted by
the reducer's output value.
I thought the question was how to get the values sorted within a call
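For the archives: the usual trick for sorting values within a single
reduce call is a secondary sort. Move the part you want sorted into the
key, partition and group on the natural key only, and let the framework's
key sort do the work. A rough, untested sketch against the old
JobConf/mapred API, with made-up class names:

  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.io.WritableComparable;
  import org.apache.hadoop.io.WritableComparator;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapred.Partitioner;

  // Composite keys look like "naturalKey\u0001valueToSortBy".

  // Partition on the natural part only, so every composite key for a
  // given natural key reaches the same reducer.
  public class NaturalKeyPartitioner implements Partitioner<Text, Text> {
    public void configure(JobConf job) {}
    public int getPartition(Text key, Text value, int numPartitions) {
      String natural = key.toString().split("\u0001", 2)[0];
      return (natural.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
  }

  // Group on the natural part only, so one reduce() call sees all the
  // composite keys (already sorted, hence sorted values) for that key.
  class NaturalKeyGroupingComparator extends WritableComparator {
    public NaturalKeyGroupingComparator() { super(Text.class); }
    public int compare(WritableComparable a, WritableComparable b) {
      String na = a.toString().split("\u0001", 2)[0];
      String nb = b.toString().split("\u0001", 2)[0];
      return na.compareTo(nb);
    }
  }

  // Wiring, in job setup:
  //   job.setPartitionerClass(NaturalKeyPartitioner.class);
  //   job.setOutputValueGroupingComparator(NaturalKeyGroupingComparator.class);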
I have had exactly the same problem using the command line to cat
files - they can take ages, although I don't know why. Network
utilisation does not seem to be the bottleneck, though.
(Running 0.15.3)
Is the slow part of the reduce while you are waiting for the map data to
copy over to