Renaming datanodes should not affect HDFS. HDFS does not depend on
hostname or IP for consistency of data. You can try renaming a few of
the nodes.
Of course, if you rename the NameNode, you need to update the config file to
reflect that.
Stuart White wrote:
Is it possible to rename all
conf.get("map.input.file") should work. If not, then it is a bug in the new
mapreduce API in 0.20.
- Sharad
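For what it's worth, a minimal old-API (pre-0.20) sketch of reading that property in configure(); the class name is hypothetical:

import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;

// In the old API the framework sets "map.input.file" per task
// whenever the split is a FileSplit.
public class FileAwareMapper extends MapReduceBase {
  private String inputFile;

  @Override
  public void configure(JobConf job) {
    // Returns null if the property was never set (e.g. non-file splits).
    inputFile = job.get("map.input.file");
  }
}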
Hi there,
This is a kind of newbie question (at least as far as Hadoop is concerned).
I was wondering if there were any Hadoop-based projects around dealing with
image indexing and searching? We are working in this area and it might be
interesting to have a look at such a project.
Second question is
There is a text-search
Hello all.
I need to process many gigs of new data every 10 minutes. Every 10 minutes
cron launches a bash script, do.sh, that puts data into HDFS and launches
processing. But...
Hadoop isn't military software, so there is a chance of errors with
HDFS. So I need to watch log files to catch
b wrote:
But after formatting and starting DFS I need to wait some time (sleep
60) before putting data into HDFS. Otherwise I will receive
NotReplicatedYetException.
That means the namenode is up but there aren't enough workers yet.
Greetings All,
Opera Software AS (www.opera.com) in Oslo/Norway is looking for an
experienced Hadoop Engineer to join the Statistics Team in order to
provide business intelligence metrics both internally and to our customers.
If you have the experience and are willing to relocate to beautiful
OK, thanks for the pointer.
If I wind up rolling our own code to handle this I'll make sure to
contribute it.
DR
Aaron Kimball wrote:
There is no technical limit that prevents Hadoop from operating in this
fashion; it's simply the case that the included InputFormat implementations
do not do
On 6/2/09, jason hadoop jason.had...@gmail.com wrote:
you can always dump the entire property space and work it out that way.
I dumped the property space and I could only find mapred.input.dir.
There was no mapred.input.file.
--
Rares
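In case it's useful, a small sketch of dumping the property space (JobConf is iterable over its entries in recent releases; the class name is hypothetical):

import java.util.Map;
import org.apache.hadoop.mapred.JobConf;

// Print every key/value pair the job sees; handy for checking
// whether a property like mapred.input.file is set at all.
public class DumpProps {
  public static void dump(JobConf job) {
    for (Map.Entry<String, String> entry : job) {
      System.out.println(entry.getKey() + " = " + entry.getValue());
    }
  }
}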
On Wed, Jun 3, 2009 at 5:17 PM, Edward J. Yoon edwardy...@apache.org wrote:
take a look at HADOOP-5368, :)
On 09-6-4 12:27 AM, Rares Vernica rvern...@gmail.com wrote:
I'm backporting some code I wrote for 0.19.1 to 0.18.3 (long story),
and I'm finding that when I run a job and try to pass options with -D
on the command line, the option values aren't showing up in my
JobConf. I logged all the key/value pairs in the JobConf, and the
option I passed
Hi,
I want to share an object (a Lucene IndexWriter instance) between mappers
running on the same node for one job (not across multiple jobs). Please correct me
if I am wrong:
if I set the property mapred.job.reuse.jvm.num.tasks to -1, then all
mappers of one job will be executed in the same JVM.
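If it helps, the programmatic equivalent in a driver, as a minimal sketch (0.19+ API):

import org.apache.hadoop.mapred.JobConf;

public class JvmReuseExample {
  public static void main(String[] args) {
    JobConf job = new JobConf();
    // -1 = reuse one JVM for an unlimited number of this job's tasks
    // per tasktracker (sets mapred.job.reuse.jvm.num.tasks).
    job.setNumTasksToExecutePerJvm(-1);
  }
}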
Hi all,
In the remove lzo JIRA ticket
https://issues.apache.org/jira/browse/HADOOP-4874 Tatu mentioned he was
going to port fastlz from C to Java and provide a patch. Have there been any
updates on that? Or is anyone working on any additional custom compression
codecs?
Thanks,
Kris J.
On 6/3/09, He Yongqiang heyongqi...@software.ict.ac.cn wrote:
take a look at HADOOP-5368, :)
There you set map.input.file yourself; I think it should already be set by Hadoop.
Hi all,
I am not sure if this is the right mailing list, but I was wondering about the
state of the Eclipse plugin for Hadoop. I have found it very valuable in my
M/R development; I have posted, seen, and fixed a few bugs but I haven't
seen any response in JIRA. Is anyone still using or maintaining the
Hey everyone!
I just wanted to give a BIG THANKS for everyone who came. We had over a
dozen people, and a few got lost at UW :) [I would have sent this update
earlier, but I flew to Florida the day after the meeting].
If you didn't come, you missed quite a bit of learning and topics. Such as:
Hi,
I have some Hadoop code that works properly when the input files are not
compressed, but it is not working for the gzipped versions of those
files. My files are named with *.gz, but the format is not being
recognized. I'm under the impression I don't need to set any JobConf
parameters to
Hi Adam,
Gzipped files don't play that nicely with Hadoop, because they aren't
splittable. Can you use bzip2 instead? bzip2 files play more nicely with
Hadoop, because they're splittable. If you're stuck with gzip, then take a
look here: http://issues.apache.org/jira/browse/HADOOP-437. I
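A quick diagnostic sketch, in case it helps (the input path is hypothetical): check whether the codec factory actually matches the .gz suffix:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;

public class CodecCheck {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    CompressionCodecFactory factory = new CompressionCodecFactory(conf);
    // Codecs are chosen purely by file suffix; null means .gz was
    // not matched (check the io.compression.codecs property).
    CompressionCodec codec = factory.getCodec(new Path("input/data.gz"));
    System.out.println(codec == null
        ? "no codec matched" : codec.getClass().getName());
  }
}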
Great Bradford,
Can you post some videos if you have some ?
Best
Bhupesh
On 6/3/09 11:58 AM, Bradford Stephens bradfordsteph...@gmail.com wrote:
Hi all,
I have an urgent question regarding processing binary (image) data using
Hadoop streaming.
I am looking for the simplest solution, preferably without making changes to
Hadoop and/or the streaming package.
I got some hints from this mailing list, including using a customized
InputFormat, or
Sorry, no videos this time. The conversation wasn't very structured... next
month I'll record it :)
On Wed, Jun 3, 2009 at 1:59 PM, Bhupesh Bansal bban...@linkedin.com wrote:
One simple solution is to use Dumbo, a Python interface to Hadoop that
supports binary streaming:
http://wiki.github.com/klbostee/dumbo
Zak
On Wed, Jun 3, 2009 at 5:18 PM, openresearch
qiming...@openresearchinc.com wrote:
The text serializer will pull out an entire string by using a null
terminator at the end.
If you need to know the number of string objects, though, you'll have to
serialize that count before the strings, then use a for loop to decode the rest of
them.
- Aaron
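A sketch of that count-then-loop pattern, in case it helps (class and field names are hypothetical):

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableUtils;

// A value type holding a count-prefixed array of strings.
public class StringListWritable implements Writable {
  private String[] values = new String[0];

  public void write(DataOutput out) throws IOException {
    WritableUtils.writeVInt(out, values.length);  // serialize the count first
    for (String s : values) {
      WritableUtils.writeString(out, s);          // then each string
    }
  }

  public void readFields(DataInput in) throws IOException {
    int n = WritableUtils.readVInt(in);           // read the count back
    values = new String[n];
    for (int i = 0; i < n; i++) {                 // loop to decode the rest
      values[i] = WritableUtils.readString(in);
    }
  }
}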
On Tue, Jun 2, 2009 at 6:01 PM, dealmaker
If you can use an existing serializable type to hold that field (e.g., if
it's an integer, then use IntWritable) then you can just get away with that.
If you are specifying your own class for a key or value class, then yes, the
class must implement readFields() and write().
There's no concept of
You can block for safemode exit by running 'hadoop dfsadmin -safemode wait'
rather than sleeping for an arbitrary amount of time.
More generally, I'm a bit confused about what you mean by all this. Hadoop daemons
may individually crash, but you should never need to reformat HDFS and start
from scratch.
Are you running your program via ToolRunner.run()? How do you instantiate
the JobConf object?
- Aaron
On Wed, Jun 3, 2009 at 10:19 AM, Ian Soboroff ian.sobor...@nist.gov wrote:
I have the following as the type of my value object. Do I need to implement
readFields() and write() functions?
private static class StringArrayWritable extends ArrayWritable {
  // A no-arg constructor is required so Hadoop can create the class
  // reflectively when deserializing values.
  public StringArrayWritable() {
    super(Text.class);
  }
  private StringArrayWritable(String[] strings) {
    super(strings);
  }
}
Aaron Kimball wrote:
Yes, and I get the JobConf via 'JobConf job = new JobConf(conf,
the.class)'. The conf is the Configuration object that comes from
getConf. Pretty much copied from the WordCount example (which this
program used to be a long while back...)
thanks,
Ian
On Jun 3, 2009, at 7:09 PM, Aaron
If, after I call getConf to get the conf object, I manually add the
key/value pair, it's there when I need it. So it feels like ToolRunner
isn't parsing my args for some reason.
Ian
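For comparison, a minimal driver wired through ToolRunner (class name hypothetical); with this shape the -D options should land in the JobConf:

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyDriver extends Configured implements Tool {
  public int run(String[] args) throws Exception {
    // getConf() returns the Configuration that ToolRunner populated
    // from the -D options before calling run().
    JobConf job = new JobConf(getConf(), MyDriver.class);
    // ... configure mapper, reducer, and input/output paths here ...
    JobClient.runJob(job);
    return 0;
  }

  public static void main(String[] args) throws Exception {
    // ToolRunner strips the generic options (-D, -fs, -jt) from args
    // and applies them to the Tool's Configuration.
    System.exit(ToolRunner.run(new MyDriver(), args));
  }
}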
Thanks for your information, Sharad.
Do you have a sample of the reuse of static variables?
Thanks,
Jianmin
Thanks a lot for your suggestions on the interplay between the job and the driver,
Chuck.
Yes, the job may hold, say, some training data, which is needed in each round
of the job. I will check the link you provided. Actually, I am thinking of some
really lightweight map/reduce jobs. For example,
Ok, help. I am trying to create local task outputs in my reduce job,
and they get created, then go poof when the job's done.
My first take was to use FileOutputFormat.getWorkOutputPath, and
create directories in there for my outputs (which are Lucene
indexes). Exasperated, I then wrote a
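One pattern worth trying, sketched below (directory name hypothetical; assumes the indexes can live on HDFS): write under the task's work output path, which is promoted on commit:

import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;

public class SideEffectFiles {
  // Create a side-effect directory under the task's work output path.
  // Its contents are moved to the final output directory when the task
  // commits, and discarded if the task fails or is killed, so they
  // should not go poof for successful tasks.
  public static Path makeIndexDir(JobConf job) throws IOException {
    Path work = FileOutputFormat.getWorkOutputPath(job);
    Path indexDir = new Path(work, "lucene-index");
    work.getFileSystem(job).mkdirs(indexDir);
    return indexDir;
  }
}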
Binary support has been added for 0.21. One option is to wait for 0.21 to get
released, or you might try applying the patch from HADOOP-1722.
- Sharad
Jianmin Woo wrote:
Do you have a sample of the reuse of static variables?
You can define static variables in your Mapper/Reducer class. Static variables
survive as long as the JVM is alive, so multiple tasks of the same job running in a
single JVM will be able to share them.
- Sharad
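A minimal sketch of that pattern, assuming JVM reuse is enabled (mapred.job.reuse.jvm.num.tasks=-1); the shared object here is a placeholder:

import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;

public class SharingMapperBase extends MapReduceBase {
  // Static state lives as long as the JVM, so with JVM reuse every
  // task of this job running in that JVM sees the same instance.
  private static Object shared;

  @Override
  public void configure(JobConf job) {
    synchronized (SharingMapperBase.class) {
      if (shared == null) {
        // The first task in this JVM pays the initialization cost
        // (e.g. opening a Lucene IndexWriter).
        shared = new Object();  // placeholder for the real object
      }
    }
  }
}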