On Thu, Jul 23, 2009 at 06:49, Palleti,
Pallavi wrote:
> Hi all,
>
>
>
> We figured out that anyone who has configured their local hadoop with
> the remote cluster's hadoop details and has the user name hadoop can get
> administrative rights on the cluster. For example, if I create a user
> as hadoop
This mainly happens when you do not have enough space. Please clean up and
run again.
On Tue, Jul 21, 2009 at 10:33 PM, George Pang wrote:
> Hi users,
>
> Please help with this one - I got an error running a two-node cluster
> on big files. The error is:
>
> 2365222 [main] ERROR
> org.apac
For what it's worth, we ended up solving this problem (today) by using
EasyMock with ClassExtension. It's an awful lot of magic, but it seems
to work just fine for our purposes. It would be great if doing
bytecode weaving under the hood weren't necessary just to write test
code, though.
-- David
Does MultipleOutputFormat suffice?
Cheers!
Amogh
-Original Message-
From: Mark Kerzner [mailto:markkerz...@gmail.com]
Sent: Thursday, July 23, 2009 6:24 AM
To: core-u...@hadoop.apache.org
Subject: Output of a Reducer as a zip file?
Hi,
my output consists of a number of binary files, cor
Do not allow direct access to the hadoop cluster from untrusted machines.
Also, until further security measures are implemented, hadoop trusts the
origin machine and library to identify the user correctly. Soon there will
be a better level of authentication, but for now that is it.
This works ou
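One concrete way to follow the "no direct access from untrusted machines" advice, sketched under assumptions (the RPC ports below are the common defaults from fs.default.name and mapred.job.tracker, and the trusted subnet is made up; check your own config), is to firewall the master ports:

```
# Sketch only: allow NameNode (8020) and JobTracker (8021) RPC traffic
# from a trusted subnet, drop everything else. Adjust ports/subnet to
# match your cluster's configuration.
iptables -A INPUT -p tcp --dport 8020 -s 10.0.0.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 8020 -j DROP
iptables -A INPUT -p tcp --dport 8021 -s 10.0.0.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 8021 -j DROP
```

This only limits who can reach the RPC interface; as noted above, Hadoop still trusts whatever user name the client library reports.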
Hi all,
We figured out that anyone who has configured their local hadoop with
the remote cluster's hadoop details and has the user name hadoop can get
administrative rights on the cluster. For example, if I create a user
as hadoop locally on my machine and have the conf directory details from the
cl
er, +CC mapreduce-dev...
- A
On Wed, Jul 22, 2009 at 8:17 PM, Aaron Kimball wrote:
> +CC mapred-dev
>
> Hm.. Making this change is actually really difficult.
>
> After changing Mapper.java, I understand why this was made a
> non-static member. By making Context non-static, it can inherit from
> M
+CC mapred-dev
Hm.. Making this change is actually really difficult.
After changing Mapper.java, I understand why this was made a
non-static member. By making Context non-static, it can inherit from
MapContext and bind to the type qualifiers already
specified in the class definition. So you can'
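The binding described above can be shown with a toy stand-in (these are not Hadoop's real classes, just a minimal sketch of the Java mechanism):

```java
// Toy illustration of why Context is a non-static inner class: a
// non-static inner class sees the enclosing class's type parameters,
// so it needs no type arguments of its own.
class Mapper<KEYIN, VALUEIN> {
    class Context {
        KEYIN currentKey;      // bound to Mapper's KEYIN
        VALUEIN currentValue;  // bound to Mapper's VALUEIN
    }
    // A static nested class could not see KEYIN/VALUEIN; it would have
    // to redeclare its own parameter list: static class Ctx<K, V> { }
}
```

This is also why a Context instance can only be created relative to an enclosing Mapper instance (`mapper.new Context()` in plain Java), which is what makes it awkward to construct in test code.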
Both of those are good points. I'll submit a patch.
- Aaron
On Wed, Jul 22, 2009 at 6:24 PM, Ted Dunning wrote:
> To amplify David's point, why is the argument a Mapper.Context rather than
> MapContext?
>
> Also, why is the Mapper.Context not static?
>
> On Wed, Jul 22, 2009 at 5:29 PM, David Hall
Is this a property introduced by Hadoop version 0.19.0? Where can I find out
more about this?
Thanks!
Mithila
On Tue, Jul 14, 2009 at 8:16 PM, akhil1988 wrote:
>
> Thanks, Tom this was what I was looking for.
> Just to confirm its usage - it means that upon jobtracker restart it will
> automoti
To amplify David's point, why is the argument a Mapper.Context rather than
MapContext?
Also, why is the Mapper.Context not static?
On Wed, Jul 22, 2009 at 5:29 PM, David Hall wrote:
> This is nice, but doesn't it suffer from the same problem? MRUnit uses
> the mapred API, which is deprecated, a
Hi,
my output consists of a number of binary files, corresponding text files,
and one descriptor file. Is there a way for my reducer to produce a zip
of all binary files, another zip of all text ones, and a separate text
descriptor? If not, how close to this can I get? For example, I could code
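The zip-building part at least is plain JDK. A hypothetical helper (the class and method names are mine, not a Hadoop API): pack named byte[] payloads into a single in-memory zip, which a reducer could then write to its output, or the same idea could stream a ZipOutputStream directly over an HDFS output stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Sketch: bundle several named payloads into one zip archive in memory.
public class ZipPacker {
    public static byte[] pack(Map<String, byte[]> files) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ZipOutputStream zos = new ZipOutputStream(bos);
        for (Map.Entry<String, byte[]> e : files.entrySet()) {
            zos.putNextEntry(new ZipEntry(e.getKey())); // one member per file
            zos.write(e.getValue());
            zos.closeEntry();
        }
        zos.close(); // flushes and writes the zip central directory
        return bos.toByteArray();
    }
}
```

Producing two separate zips plus a text descriptor from one reducer would still need something like MultipleOutputFormat (as suggested above) to route the three outputs to distinct files.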
I second Todd's recommendation. Elastic MapReduce currently doesn't have a
mechanism for users to change mapred.tasktracker.map.tasks.maximum. However, by
default we run more mappers per core than is generally recommended, because
we've found it results in better performance in the EC2/S3 enviro
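On a self-managed cluster (as opposed to Elastic MapReduce) that slot count is configurable per tasktracker; a mapred-site.xml sketch, with an illustrative value rather than a recommendation:

```xml
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>4</value>
  <description>Maximum number of map tasks run simultaneously by one
  tasktracker; restart the tasktracker after changing it.</description>
</property>
```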
This is nice, but doesn't it suffer from the same problem? MRUnit uses
the mapred API, which is deprecated, and the new API doesn't use
OutputCollector, but a non-static inner class.
-- David
On Wed, Jul 22, 2009 at 4:52 PM, Aaron Kimball wrote:
> Hi David,
>
> I wrote a contrib module called MRU
It looks like there's quite a bit more documentation about MRUnit on the
Cloudera site that's not included in the regular documentation. Looks
like about twice as much. It would be great if this could be added to
the content that's in mrunit/doc
Thanks,
Jakob
Aaron Kimball wrote:
Hi David
If there is one, it's not contributed back to the public project. My
guess is probably not :(
- Aaron
On Wed, Jul 22, 2009 at 10:00 AM, Alberto Luengo
Cabanillas wrote:
> Hi everyone! Does anybody know if there's an Eclipse plugin for developing
> programs in C/C++ and submit them as Jobs to a had
Hi David,
I wrote a contrib module called MRUnit
(http://issues.apache.org/jira/browse/hadoop-5518) designed to let you
write unit tests for mappers/reducers more easily. It's slated for inclusion
in 0.21, not 0.20 unfortunately, but you can download the patch above
as well as MAPREDUCE-680 and build it a
On Wed, Jul 22, 2009 at 4:20 PM, Hitchcock, Andrew wrote:
> We don't have hard numbers on S3 transfer rates. The cluster-wide transfer
> rate depends on a number of factors such as instance type, cluster size, and
> general network congestion.
>
Your mileage of course will vary based on the fact
We don't have hard numbers on S3 transfer rates. The cluster-wide transfer rate
depends on a number of factors such as instance type, cluster size, and general
network congestion.
I'm curious why you think S3 won't work for your use case. Would you like to
elaborate? As I described in the previ
Hi,
I'm a student working with Apache Mahout for the Google Summer of
Code. We recently moved to 0.20.0, and I was porting my code to the
new API. Unfortunately, I (and the whole project team) seem to have
run into a problem when it comes to testing them.
Historically, we would create a Mapper in
They're designed to take a few minutes and seem to in operations here
and at Yahoo. Details, of course, will vary depending on data volumes
and hardware. More benchmarks welcome. :)
--Ari
On Mon, Jul 20, 2009 at 3:04 AM, zsongbo wrote:
> Hi Ari,
>
> Thanks.
> In Chukwa, how about the performance
Hello,
I downloaded Hadoop 0.20.0 and used the src/contrib/ec2/bin scripts to
launch a Hadoop cluster on Amazon EC2. To do so, I modified the bundled
scripts above for my EC2 account, and then created my own Hadoop 0.20.0
AMI. The steps I followed for creating AMIs and launching EC2 Hadoop
cluster
On Jul 22, 2009, at 8:22 AM, Rares Vernica wrote:
Hello,
I wonder how the Yahoo! developers generated the Task Timeline
figures in their "Hadoop Sorts a Petabyte..." blog post:
The script is at:
http://people.apache.org/~omalley/tera-2009/job_history_summary.py
The input data is the job
Nope, if I recall correctly the data is randomly generated (the task itself
requires fixed-length binary strings to be sorted).
Miles
2009/7/22 Harish Mallipeddi
> On Wed, Jul 22, 2009 at 8:52 PM, Rares Vernica wrote:
>
> > Hello,
> >
> > I wonder how the Yahoo! developers generated the Task Timeline
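For reference, the sorted input really is fixed-length; a rough sketch under assumptions (this only loosely mirrors teragen's layout - the real tool also encodes a row id and filler fields): 100-byte records whose first 10 bytes are a random key.

```java
import java.util.Random;

// Sketch of terasort-style fixed-length input records (assumed layout:
// 10-byte random key + 90 bytes of payload, 100 bytes total).
public class RecordSketch {
    public static byte[] record(Random rnd) {
        byte[] rec = new byte[100];
        for (int i = 0; i < 10; i++) {
            rec[i] = (byte) (' ' + rnd.nextInt(95)); // printable random key byte
        }
        return rec; // remaining 90 bytes left as zero filler in this sketch
    }
}
```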
On Wed, Jul 22, 2009 at 8:52 PM, Rares Vernica wrote:
> Hello,
>
> I wonder how the Yahoo! developers generated the Task Timeline
> figures in their "Hadoop Sorts a Petabyte..." blog post:
>
>
> http://developer.yahoo.net/blogs/hadoop/2009/05/hadoop_sorts_a_petabyte_in_162.html
>
> I am intere
Hello,
I wonder how the Yahoo! developers generated the Task Timeline
figures in their "Hadoop Sorts a Petabyte..." blog post:
http://developer.yahoo.net/blogs/hadoop/2009/05/hadoop_sorts_a_petabyte_in_162.html
I am interested to know the following two aspects:
1. How did they collect the da
JQ Hadoop wrote:
I'm wondering where one can get the pagerank implementation to try it out.
Thanks,
http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/extras/citerank/
It works over the CiteSeer citation dataset.
But right now the script forcibly adds an extra -Xmx1000m even if you
don't want it.
I guess I'll be submitting a patch for hadoop-daemon.sh later. :) :)
thank you all
On 7/22/09 2:25 AM, Amogh Vasekar wrote:
I haven't played a lot with it, but you may want to check if setting
HADOOP_NA
Hi all,
In simple terms: why does an output stream that failed to close while the
datanodes were unavailable fail again when I try to close it once the
datanodes are available? Could someone kindly help me tackle this
situation?
Thanks
Pallavi
-Original Message-
From: Palleti, P
I'm wondering where one can get the pagerank implementation to try it out.
Thanks,
-JQ
On Wed, Jul 22, 2009 at 6:14 PM, Steve Loughran wrote:
> Owen O'Malley wrote:
>
>>
>> On Jul 21, 2009, at 8:28 AM, Ted Dunning wrote:
>>
>> There are already several such efforts.
>>>
>>> Pig has PigMix
>>>
>>>
Owen O'Malley wrote:
On Jul 21, 2009, at 8:28 AM, Ted Dunning wrote:
There are already several such efforts.
Pig has PigMix
Hadoop has terasort and likely some others as well.
Hadoop has the terasort, and grid mix. There is even a new version of
the grid mix coming out. Look at:
https:/
I haven't played a lot with it, but you may want to check if setting
HADOOP_NAMENODE_OPTS, HADOOP_TASKTRACKER_OPTS help. Let me know if you find a
way to do this :)
Cheers!
Amogh
-Original Message-
From: Fernando Padilla [mailto:f...@alum.mit.edu]
Sent: Wednesday, July 22, 2009 9:47 AM
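Assuming those are the knobs meant (they live in conf/hadoop-env.sh in 0.20; the heap sizes below are placeholders, not recommendations), per-daemon JVM options go in the per-daemon _OPTS variables so the extra flags reach only that daemon:

```
# conf/hadoop-env.sh -- appended to the named daemon's JVM command line only
export HADOOP_NAMENODE_OPTS="-Xmx2000m $HADOOP_NAMENODE_OPTS"
export HADOOP_TASKTRACKER_OPTS="-Xmx512m $HADOOP_TASKTRACKER_OPTS"
```

Note this doesn't remove the -Xmx1000m that hadoop-daemon.sh itself injects; a later -Xmx on the command line typically wins, but that behavior is JVM-dependent.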
Great guys,
thank you a lot it is working now.
On Wed, Jul 22, 2009 at 3:55 AM, Aaron Kimball wrote:
> And regarding your desire to set things on the command line: If your
> program implements Tool and is launched via ToolRunner, you can
> specify "-D myparam=myvalue" on the command line and it
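As a toy illustration of that convention (this is not Hadoop's GenericOptionsParser, just a simplified stand-in that handles only the space-separated "-D key=value" form):

```java
import java.util.HashMap;
import java.util.Map;

// Simplified sketch of the "-D key=value" parsing that Hadoop applies
// (via GenericOptionsParser) before a Tool sees its remaining arguments.
public class DashDParser {
    public static Map<String, String> parse(String[] args) {
        Map<String, String> conf = new HashMap<String, String>();
        for (int i = 0; i < args.length; i++) {
            if (args[i].equals("-D") && i + 1 < args.length) {
                String[] kv = args[++i].split("=", 2); // split on first '=' only
                if (kv.length == 2) {
                    conf.put(kv[0], kv[1]);
                }
            }
        }
        return conf;
    }
}
```

In a real job the parsed pairs end up in the Configuration object, retrievable inside tasks with conf.get("myparam").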
Hi,
2009/7/22 Mathias De Maré
> I went over the steps, and it looks like I did the same (only I didn't
> create a dedicated user and I didn't disable IPv6, since I can use it here).
> Oh, and I noticed one more thing: when I start Hadoop by running
> bin/start-dfs.sh, wait about 20 seconds for e
Hi,
On Fri, Jul 17, 2009 at 5:37 PM, Bogdan M. Maryniuk <
bogdan.maryn...@gmail.com> wrote:
> 2009/7/17 Mathias De Maré :
> > I'm using Hadoop 0.20.0 (semidistributed mode, or whatever it's called --
> I
> > can't look up the name, since the documentation on the site seems to be
> > down), and I