Hi Nitin,
It looks like you may be using the wrong port number - try 8088 for
the resource manager UI.
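For example, if the resource manager is running on the local machine
you would point your browser at http://localhost:8088/ (substitute the
resource manager's hostname as appropriate).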
Cheers,
Tom
On Mon, Nov 28, 2011 at 4:02 AM, Nitin Khandelwal
wrote:
> Hi,
>
> I was trying to setup Hadoop 0.23.0 with help of
> http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoo
On Thu, Oct 13, 2011 at 2:06 PM, Raimon Bosch wrote:
> By the way,
>
> The URL I'm trying has a '_' in the bucket name. Could this be the problem?
Yes, underscores are not permitted in hostnames.
Cheers,
Tom
>
> 2011/10/13 Raimon Bosch
>
>> Hi,
>>
>> I've been having some problems with one of
JobConf and the old API are no longer deprecated in the forthcoming
0.20.205 release, so you can continue to use them without issue.
The equivalent in the new API is setInputFormatClass() on
org.apache.hadoop.mapreduce.Job.
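For example, a minimal sketch with the new API (TextInputFormat here
is just illustrative; Job is org.apache.hadoop.mapreduce.Job and
TextInputFormat is from org.apache.hadoop.mapreduce.lib.input):

Job job = new Job(conf, "my job");
job.setInputFormatClass(TextInputFormat.class);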
Cheers,
Tom
On Tue, Oct 11, 2011 at 9:18 AM, Keith Thompson wrote:
> I se
You might consider Apache Whirr (http://whirr.apache.org/) for
bringing up Hadoop clusters on EC2.
Cheers,
Tom
On Wed, Aug 31, 2011 at 8:22 AM, Robert Evans wrote:
> Dmitry,
>
> It sounds like an interesting idea, but I have not really heard of anyone
> doing it before. It would make for a goo
Hi Witold,
Is this on Windows? The scripts were re-structured after Hadoop 0.20,
and looking at them now I notice that the cygwin path translation for
the classpath seems to be missing. You could try adding the following
line to the "if $cygwin" clause in bin/hadoop-config.sh:
CLASSPATH=`cygpat
The instructions at
http://hadoop.apache.org/common/docs/r0.20.2/quickstart.html should be
what you need.
Cheers,
Tom
On Wed, Mar 2, 2011 at 12:59 AM, Manish Yadav wrote:
> Dear Sir/Madam
> I'm very new to hadoop. I'm trying to install hadoop on my computer. I
> followed a weblink and try to in
These are generated files. If you run "ant avro-generate
eclipse" then Eclipse should find these files.
Cheers,
Tom
On Mon, Feb 28, 2011 at 2:43 AM, bharath vissapragada
wrote:
> Hi all,
>
> I checked out the "map-reduce" trunk a few days back and the following
> files are missing...
>
> impor
Hi Steve,
Sorry to hear about the problems you had. The issue you hit was a
result of MAPREDUCE-954, and there was some discussion on that JIRA
about compatibility. I believe the thinking was that the context
classes are framework classes, so users don't extend/implement them in
the normal course
On Thu, Oct 21, 2010 at 8:23 AM, ed wrote:
> Hello,
>
> The MapRunner class looks promising. I noticed it is in the deprecated
> mapred package but I didn't see an equivalent class in the mapreduce
> package. Is this going to be ported to mapreduce or is it no longer being
> supported? Thanks!
Hi Ed,
The directory structure moved around as a result of the project
splitting into three subprojects (Common, HDFS, MapReduce). The
streaming jar is in mapred/contrib/streaming in the distribution.
Cheers,
Tom
On Mon, Oct 4, 2010 at 8:03 PM, edward choi wrote:
> Hi,
> I've recently downloade
Hi John,
This question really belongs on the Cloudera list
(http://getsatisfaction.com/cloudera) or the Whirr user list, but I
wonder if you're seeing this because you're not using the SOCKS proxy
for DNS lookups? See bottom of
https://docs.cloudera.com/display/DOC/Launching+a+Cluster.
Cheers
Tom
Hi Mike,
What do you get if you type "./hadoop classpath"? Does it contain the
Hadoop common JAR?
To avoid the deprecation warning you should use "hadoop fs", not "hadoop dfs".
Tom
On Wed, Sep 15, 2010 at 12:53 PM, Mike Franon wrote:
> Hi,
>
> I just setup 3 node hadoop cluster using the lates
Hi Sonal,
The 0.21.0 jars are not available in Maven yet, since the process for
publishing them after the project split has changed.
See HDFS-1292 and MAPREDUCE-1929.
Cheers,
Tom
On Fri, Sep 10, 2010 at 1:33 PM, Sonal Goyal wrote:
> Hi,
>
> Can someone please point me to the Maven repo for 0.21 release? Tha
The 0.21.0 jars are not in the Apache Maven repos yet, since the
process for publishing them after the project split has changed. HDFS-1292 and
MAPREDUCE-1929 are the tickets to fix this.
Cheers,
Tom
On Sat, Aug 28, 2010 at 9:10 PM, Mark wrote:
> On 8/27/10 9:25 AM, Owen O'Malley wrote:
>>
>> On Aug 27, 201
Hi everyone,
I am pleased to announce that Apache Hadoop 0.21.0 is available for
download from http://hadoop.apache.org/common/releases.html.
Over 1300 issues have been addressed since 0.20.2; you can find details at
http://hadoop.apache.org/common/docs/r0.21.0/releasenotes.html
http://hadoop.ap
Hi Oleg,
I don't know of any plans to implement this. However, since this is a
block-based storage system which uses S3, I wonder whether an
implementation could use some of the logic in HDFS for block storage
and append in general.
Cheers,
Tom
On Thu, Aug 12, 2010 at 8:34 AM, Aleshko, Oleg
wro
Hi Ananth,
The next release of Hadoop will be 0.21.0, but it won't have Kerberos
authentication in it (since it's not all in trunk yet). The 0.22.0
release later this year will have a working version of security in it.
Cheers,
Tom
On Wed, Jul 7, 2010 at 8:09 AM, Ananth Sarathy
wrote:
>
> is the
Hi Felix,
Aaron Kimball hit the same problem - it's being discussed at
https://issues.apache.org/jira/browse/MAPREDUCE-1920.
Thanks for reporting this.
Cheers,
Tom
On Tue, Jul 6, 2010 at 11:26 AM, Felix Halim wrote:
> I tried hadoop 0.21 release candidate.
>
> job.waitForCompletion(true);
> Co
Hi Mark,
You can find the latest version of the scripts at
http://archive.cloudera.com/cdh/3/hadoop-0.20.2+228.tar.gz.
Documentation is at http://archive.cloudera.com/docs/ec2.html.
The source code is currently in src/contrib/cloud in Hadoop Common,
but is in the process of moving to a new Incuba
Hi Susanne,
Hadoop uses the file extension to detect that a file is compressed. I
believe Hive does too. Did you store the compressed file in HDFS with
a .gz extension?
Cheers,
Tom
BTW It's best to send Hive questions like these to the hive-user@ list.
On Sun, May 2, 2010 at 11:22 AM, Susanne L
ame in each mapper)?
>
> Yuanyuan
>
> Tom White ---04/29/2010 09:42:44 AM---Hi Yuanyuan, I think you've found a bug
> - could you file a JIRA issue for this please?
>
> From: Tom White
> To: common-user@hadoop.apache.org
> Date: 04/29/2010 09:42 AM
Hi Yuanyuan,
I think you've found a bug - could you file a JIRA issue for this please?
Thanks,
Tom
On Wed, Apr 28, 2010 at 11:04 PM, Yuanyuan Tian wrote:
>
>
> I have a problem in getting the input file name in the mapper when using
> MultipleInputs. I need to use MultipleInputs to support dif
Hi Danny,
S3FileSystem has no concept of permissions, which is why this check
fails. The permissions check was introduced in
https://issues.apache.org/jira/browse/MAPREDUCE-181. Could you file
a bug for this please?
Cheers,
Tom
On Thu, Apr 22, 2010 at 4:16 AM, Danny Le
Have a look at org.apache.hadoop.io.MapWritable, which is a Map for
storing Writable keys and values.
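A minimal sketch (assuming Text keys and IntWritable values):

MapWritable map = new MapWritable();
map.put(new Text("count"), new IntWritable(1));
IntWritable count = (IntWritable) map.get(new Text("count"));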
Cheers,
Tom
On Thu, Apr 15, 2010 at 3:17 PM, Eric Sammer wrote:
> You need to implement a custom Writable (the serialization interface
> supported by Hadoop). If you want to use your own custom
I think you can set the URI on the configuration object with the key
JobContext.END_NOTIFICATION_URL.
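Something like this, for example (the URL is a placeholder, and I
believe the framework substitutes $jobId and $jobStatus when it makes
the callback):

Configuration conf = job.getConfiguration();
conf.set(JobContext.END_NOTIFICATION_URL,
    "http://example.com/notify?jobid=$jobId&status=$jobStatus");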
Cheers,
Tom
On Tue, Feb 23, 2010 at 12:02 PM, Ted Yu wrote:
> Hi,
> I am looking for counterpart to JobConf.setJobEndNotificationURI() in
> org.apache.hadoop.mapreduce
>
> Please advise.
>
> Tha
Hi Sonal,
You should use the one with the later date. The Cloudera AMIs don't
actually have Hadoop installed on them, just Java and some other base
packages. Hadoop is installed at start up time; you can find more
information at http://archive.cloudera.com/docs/ec2.html.
Cheers,
Tom
P.S. For Clo
Hi Prasen,
2) is now in the Hadoop Common repository, in src/contrib/cloud. This
is where the development effort is focused, and the older bash scripts
(1) will be deprecated over time (HADOOP-6403). The new cloud scripts
are designed to support multiple cloud providers, as well as advanced
featur
Please submit a patch for the documentation change - perhaps at
https://issues.apache.org/jira/browse/HADOOP-5973.
Cheers,
Tom
On Wed, Jan 13, 2010 at 12:09 AM, Amogh Vasekar wrote:
> +1 for the documentation change in mapred-tutorial. Can we do that and
> publish using a normal apache account?
Have a look at org.apache.hadoop.io.ArrayWritable. You may be able to
use this class in your application, or at least use it as a basis for
writing VectorWritable.
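For example, a minimal sketch of a VectorWritable built on
ArrayWritable (assuming double elements):

public class VectorWritable extends ArrayWritable {
  // the no-arg constructor tells Hadoop the element type to use
  // when deserializing
  public VectorWritable() {
    super(DoubleWritable.class);
  }
}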
Cheers,
Tom
On Tue, Dec 29, 2009 at 1:37 AM, bharath v
wrote:
> Can you please tell me , what is the functionality of those 2 method
If you are using S3 as your file store then you don't need to run HDFS
(and indeed HDFS will not start up if you try).
Cheers,
Tom
2009/12/17 Rekha Joshi :
> Not sure what the whole error is, but you can always alternatively try this -
>
> <property>
>   <name>fs.default.name</name>
>   <value>s3://BUCKET</value>
> </property>
>
> <property>
>   <name>fs.s3.awsAcce
Hi Mark,
The root partition is small, but there is plenty of storage on the
/mnt partition. See http://aws.amazon.com/ec2/instance-types/.
Cheers,
Tom
On Wed, Nov 25, 2009 at 12:30 PM, Mark Kerzner wrote:
> Hi,
>
> I have started the Apache distribution of hadoop-0.19, and I noticed that
> this
ger instance than that of the slaves?
No, this is not supported, but I can see it would be useful,
particularly for larger clusters. Please consider opening a JIRA for
it.
Cheers,
Tom
>
> Thank you,
> Mark
>
> On Tue, Nov 24, 2009 at 11:20 PM, Tom White wrote:
>
>> Mark,
>
Correct. The master runs the namenode and jobtracker, but not a
datanode or tasktracker.
Tom
On Tue, Nov 24, 2009 at 4:57 PM, Mark Kerzner wrote:
> Hi,
>
> do I understand it correctly that, when I launch a Hadoop cluster on EC2,
> the master will not be doing any work, and it is just for organi
Mark,
If the data was transferred to S3 outside of Hadoop then you should
use the s3n filesystem scheme (see the explanation on
http://wiki.apache.org/hadoop/AmazonS3 for the differences between the
Hadoop S3 filesystems).
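For example, data uploaded to S3 with external tools should be
addressed as s3n://BUCKET/path; s3://BUCKET/path refers to the
block-based filesystem, whose files are not readable by other tools.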
Also, some people have had problems embedding the secret key in the
URI, s
.@apache.org).
>
> Thank you,
> Mark
>
> On Sun, Nov 15, 2009 at 10:29 PM, Tom White wrote:
>
>> Hi Mark,
>>
>> HADOOP-6108 will add Cloudera's EC2 scripts to the Apache
>> distribution, with the difference that they will run Apache Hadoop.
>> T
Hi Mark,
HADOOP-6108 will add Cloudera's EC2 scripts to the Apache
distribution, with the difference that they will run Apache Hadoop.
The same scripts will also support Cloudera's Distribution for Hadoop,
simply by using a different boot script on the instances. So I would
suggest you use these s
MultipleInputs is available from Hadoop 0.19 onwards (in
org.apache.hadoop.mapred.lib, or org.apache.hadoop.mapreduce.lib.input
for the new API in later versions).
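For example, with the new API (the paths and mapper classes here are
just placeholders):

MultipleInputs.addInputPath(job, new Path("/logs"),
    TextInputFormat.class, LogMapper.class);
MultipleInputs.addInputPath(job, new Path("/users"),
    SequenceFileInputFormat.class, UserMapper.class);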
Tom
On Wed, Nov 4, 2009 at 8:07 AM, Mark Vigeant
wrote:
> Amogh,
>
> That sounds so awesome! Yeah I wish I had that class now. Do yo
MultipleOutputs has been ported to the new API in 0.21. See
https://issues.apache.org/jira/browse/MAPREDUCE-370.
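A minimal sketch with the new API (the named output "text" and the
key/value types are just illustrative):

// in the driver:
MultipleOutputs.addNamedOutput(job, "text", TextOutputFormat.class,
    Text.class, IntWritable.class);
// in the reducer's setup():
mos = new MultipleOutputs<Text, IntWritable>(context);
// in reduce():
mos.write("text", key, value);
// in cleanup():
mos.close();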
Cheers,
Tom
On Sat, Nov 7, 2009 at 6:45 AM, Xiance SI(司宪策) wrote:
> I just fall back to old mapred.* APIs, seems MultipleOutputs only works for
> the old API.
>
> wishes,
> Xiance
>
diate
workaround you can avoid calling the Hadoop cluster "default", and
make sure that you don't create non-Hadoop EC2 instances in the
cluster group.
Thanks,
Tom
>
> Does this help at all? Thanks.
>
> -Mark
>
> On Mon, Oct 19, 2009 at 11:52 AM, Tom White
Hi Mark,
Sorry to hear that all your EC2 instances were terminated. Needless to
say, this should certainly not happen.
The scripts are a Python rewrite (see HADOOP-6108) of the bash ones so
HADOOP-1504 is not applicable, but the behaviour should be the same:
the terminate-cluster command lists th
Have a look at the JobControl class - this allows you to set up chains
of job dependencies.
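A minimal sketch with the old-API jobcontrol classes (the job names
and confs are placeholders):

JobControl jobControl = new JobControl("chain");
Job job1 = new Job(jobConf1); // org.apache.hadoop.mapred.jobcontrol.Job
Job job2 = new Job(jobConf2);
job2.addDependingJob(job1); // job2 won't start until job1 succeeds
jobControl.addJob(job1);
jobControl.addJob(job2);
new Thread(jobControl).start();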
Tom
On Fri, Oct 2, 2009 at 11:29 AM, bharath v
wrote:
> Hi all,
>
> I have a set of map red jobs which need to be cascaded ,i.e, output of MR
> job1 is the input of MR job2. etc..
>
> Can anyone point me
On Thu, Oct 1, 2009 at 5:10 PM, Andy Sautins
wrote:
>
> Hi all. I'm struggling a bit to figure this out and wondering if anyone had
> any pointers.
>
> I'm using SequenceFiles as output from a MapReduce job ( using
> SequenceFileOutputFormat ) and then in a followup MapReduce job reading in
Hi Jeyendran,
Were there any errors reported in the datanode logs? There could be a
problem with datanodes contacting the namenode, caused by firewall
configuration problems (EC2 security groups).
Cheers,
Tom
On Fri, Sep 4, 2009 at 12:17 AM, Jeyendran
Balakrishnan wrote:
> I downloaded Hadoop 0.
Hi Cam,
Looks like it's in hadoop-hdfs-hdfswithmr-test-0.21.0-dev.jar, which
should be built with "ant jar-test".
Cheers,
Tom
On Mon, Aug 24, 2009 at 8:22 PM, Cam Macdonell wrote:
>
> Thanks Danny,
>
> It currently does not show up hadoop-common-test, hadoop-hdfs-test or
> hadoop-mapred-test wit
Hi Roman,
Have a look at CombineFileInputFormat - it might be related to what
you are trying to do.
Cheers,
Tom
On Thu, Aug 20, 2009 at 10:59 AM, roman kolcun wrote:
> On Thu, Aug 20, 2009 at 10:30 AM, Harish Mallipeddi <
> harish.mallipe...@gmail.com> wrote:
>
>> On Thu, Aug 20, 2009 at 2:39 PM
I've now updated the news section, and the documentation on the
website to reflect the 0.19.2 release.
There were several reports of it being more stable than 0.19.1 in the
voting thread:
http://www.mail-archive.com/common-...@hadoop.apache.org/msg00051.html
Cheers,
Tom
On Tue, Jul 28, 2009 at
On Mon, Aug 3, 2009 at 3:09 AM, Billy
Pearson wrote:
>
>
> not sure if it's still there but there was a param in the hadoop-site conf
> file that would allow you to skip x number of index entries when reading it
> in to memory.
This is io.map.index.skip (default 0), which will skip this number of
keys for e
That's for the case where you want to do the decompression yourself,
explicitly, perhaps when you are reading the data out of HDFS (and not
using MapReduce). When using compressed data as input to a MapReduce
job, Hadoop will automatically decompress them for you.
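If you do need to decompress explicitly, a sketch (the path is a
placeholder; the codec is inferred from the file extension):

Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
Path path = new Path("/data/file.gz");
CompressionCodec codec = new CompressionCodecFactory(conf).getCodec(path);
InputStream in = codec.createInputStream(fs.open(path));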
Tom
On Fri, Jul 31, 2009 at 5:3
Is this an area where the Offline Image Viewer might be able to help
in the future? It's not available for 0.18.3, but seems like it would
be possible to extend it as a tool to help with c) in Todd's
description.
Tom
On Mon, Jul 20, 2009 at 8:30 PM, Todd Lipcon wrote:
> Hi Arv,
>
> It sounds like
:45 PM, Rakhi Khatwani wrote:
> Hi Tom,
>
> in that case, can I kill the job by giving some command from the
> API? Or will I have to do it from the command line?
>
> On Mon, Jul 20, 2009 at 8:55 PM, Tom White wrote:
>
>> Hi Raakhi,
>>
>> You can't su
e any way you can suspend the job in the java program???
>
>
> Regards,
> Raakhi
>
> On Fri, Jul 17, 2009 at 2:48 PM, Tom White wrote:
>
>> Hi Raakhi,
>>
>> JobControl is designed to be run from a new thread:
>>
>> Thread t = new Thread(jobCon
Hi Raakhi,
JobControl is designed to be run from a new thread:
Thread t = new Thread(jobControl);
t.start();
Then you can run a loop to poll for job completion and print out status:
String oldStatus = null;
while (!jobControl.allFinished()) {
String status = getStatusString(jobCon
It seems that
> the org.apache.hadoop.mapred.Partitioner is deprecated and will be removed in
> the future.
> Do you have some suggestions on this?
>
> Thanks,
> Jianmin
>
> ____
> From: Tom White
> To: common-user@hadoop
There's a Jira to fix this here:
https://issues.apache.org/jira/browse/MAPREDUCE-434
Tom
On Mon, Jul 13, 2009 at 12:34 AM, jason hadoop wrote:
> If the jobtracker is set to local, there is no way to have more than 1
> reducer.
>
> On Sun, Jul 12, 2009 at 12:21 PM, Rares Vernica wrote:
>
>> Hello
Hi Jianmin,
Partitioner extends JobConfigurable, so you can implement the
configure() method to access the JobConf.
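A minimal sketch (the parameter name is just an example):

public class SeededPartitioner implements Partitioner<Text, IntWritable> {
  private int seed;
  public void configure(JobConf conf) {
    // parameters set on the JobConf are available here
    seed = conf.getInt("my.partition.seed", 0);
  }
  public int getPartition(Text key, IntWritable value, int numPartitions) {
    return ((key.hashCode() + seed) & Integer.MAX_VALUE) % numPartitions;
  }
}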
Hope that helps.
Cheers,
Tom
On Tue, Jul 14, 2009 at 10:27 AM, Jianmin Woo wrote:
> Hi,
>
> I am considering to implement a Partitioner that needs to access the
> parameters in C
Hi Akhil,
Have a look at the mapred.jobtracker.restart.recover property.
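To enable it, something like this in your conf (check that your
version supports job recovery):

<property>
  <name>mapred.jobtracker.restart.recover</name>
  <value>true</value>
</property>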
Cheers,
Tom
On Sun, Jul 12, 2009 at 12:06 AM, akhil1988 wrote:
>
> HI All,
>
> I am looking for ways to restart my hadoop job from where it left off when the
> entire cluster goes down or the job gets stopped due to some reason
The config looks fine, but you need to start the daemons on the
relevant servers. You will need the same config on both server1 and
server2.
On server1:
bin/start-dfs.sh
On server2:
bin/start-mapred.sh
Hope this helps.
Tom
On Tue, Jun 30, 2009 at 7:53 AM, Eason.Lee wrote:
> Just want to run