Mike,
The mapred.* API has been undeprecated and continues to be the stable
API. In 1.0.0, the new API is/was unfinished and lacks a lot of ports
from the mapred.lib.* components. This is being addressed by
https://issues.apache.org/jira/browse/MAPREDUCE-3607 if you are
interested in backporting a
Best to use a secured Hadoop cluster [0], and/or set up appropriate
firewall rules that block traffic from anything other than your trusted IPs.
[0] - https://ccp.cloudera.com/display/CDHDOC/CDH3+Security+Guide
On Mon, Jan 16, 2012 at 4:33 AM, Something Something
wrote:
> Good point. Those ports may not
All monitoring browser ports.. such as
On Sun, Jan 15, 2012 at 5:00 PM, Lance Norskog wrote:
> Can you open all of the monitoring browser ports?
>
> On Sun, Jan 15, 2012 at 3:03 PM, Something Something
> wrote:
> > Good point. Those ports may not be open. So next question - is it safe
> t
Can you open all of the monitoring browser ports?
On Sun, Jan 15, 2012 at 3:03 PM, Something Something
wrote:
> Good point. Those ports may not be open. So next question - is it safe to
> open these ports? How do we securely open these ports to avoid malicious
> attacks under EC2?
>
> (Sorry,
Good point. Those ports may not be open. So next question - is it safe to
open these ports? How do we securely open these ports to avoid malicious
attacks under EC2?
(Sorry, I know some of these questions are dumb - but we are a startup and
don't have a big sysadmin group - I guess that's why w
Yes, I did look at CompositeInputFormat. That is why I remarked that I
suppose that I should be looking under org.apache.hadoop.mapreduce.* and
sent the earlier question about why CompositeInputFormat is not under
org.apache.hadoop.mapreduce.* in Hadoop 1.0.0. But I have gotten no
answers yet
Something Something,
Have you confirmed you can connect to the port from your remote machine?
telnet ec2-xx 9000
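The same reachability check can be scripted when telnet is not installed on the client. This is only a minimal sketch: the function name port_is_open and the 3-second default timeout are assumptions for illustration, and the host stays whatever your own EC2 hostname is.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper: it only shows that something accepts connections on the
    port, not that the service behind it is a healthy NameNode.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Calling port_is_open(host, 9000) with your own hostname in place of the elided "ec2-xx" gives the same yes/no answer as the telnet test. A False result from outside EC2 combined with True from inside usually points at the security group or firewall rather than at Hadoop.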
Kindest regards.
Ron
On Sun, Jan 15, 2012 at 12:16 AM, Something Something <
mailinglist...@gmail.com> wrote:
> Hello,
>
> Our Hadoop cluster is setup on EC2, but our clien
Hi Mark
Have a look at CompositeInputFormat. I guess it is what you are
looking for to achieve map-side joins. If you are fine with a reduce-side
join, go with MultipleInputs. I have tried the same sort of joins
using MultipleInputs and have scribbled something on the same.
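For context, a map-side join with CompositeInputFormat relies on every input being sorted by key and partitioned the same way, so that matching partitions can be merged without a shuffle. The sketch below is not Hadoop code. It is a plain Python illustration of that merge step for one pair of partitions, and it assumes each key appears at most once per input; the name merge_join is hypothetical.

```python
def merge_join(left, right):
    """Inner-join two lists of (key, value) pairs that are already sorted by key.

    Assumption for this sketch: keys are unique within each list.
    Returns a list of (key, left_value, right_value) tuples.
    """
    joined = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, left_value = left[i]
        right_key, right_value = right[j]
        if left_key == right_key:
            # Keys match: emit the joined tuple and advance both sides.
            joined.append((left_key, left_value, right_value))
            i += 1
            j += 1
        elif left_key < right_key:
            # Left key has no partner yet; skip ahead on the left.
            i += 1
        else:
            # Right key has no partner yet; skip ahead on the right.
            j += 1
    return joined
```

Because each side is read once, front to back, the join needs no in-memory hash table. That is the reason the sorted, identically partitioned layout matters.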
BTW, each key appears exactly once in the large constant dataset, and
exactly once in each MR job's output.
I am thinking the right approach is to consistently partition the job
output and the large constant dataset, with the number of partitions being
the number of reduce tasks; each part goes
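A rough sketch of the consistent-partitioning idea described above, in plain Python rather than Hadoop code. Hadoop's default HashPartitioner assigns a key to a partition from the key's hash code modulo the number of reduce tasks; here zlib.crc32 stands in as a stable hash, and the names partition_for and partition_dataset are hypothetical.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index with a hash that is stable across runs.

    crc32 is a stand-in chosen for this sketch; it is not what Hadoop uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_dataset(records, num_partitions):
    """Split (key, value) records into `num_partitions` lists, each sorted by key."""
    parts = [[] for _ in range(num_partitions)]
    for key, value in records:
        parts[partition_for(key, num_partitions)].append((key, value))
    for part in parts:
        part.sort(key=lambda kv: kv[0])
    return parts
```

If the large constant dataset and each job's output are both split with the same function and the same partition count, then partition i of one holds exactly the keys that can match partition i of the other, so the join can proceed one partition pair at a time.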
I have a problem that needs to be solved by an iteration of MapReduce
jobs, and in each iteration I need to start by doing an equijoin between a
large constant dataset and the output of the previous iteration; the
remainder of my map function works on a joined tuple in a way whose
details are n