Re: Renaming all nodes in Hadoop cluster

2009-06-03 Thread Raghu Angadi
Renaming datanodes should not affect HDFS. HDFS does not depend on hostname or IP for consistency of data. You can try renaming a few of the nodes. Of course, if you rename the NameNode, you need to update the config file to reflect that. Stuart White wrote: Is it possible to rename all
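For the NameNode case Raghu mentions, the config entry to update in the hadoop-site.xml of that era is fs.default.name. A sketch of the fragment, with a placeholder hostname and port:

```xml
<!-- hadoop-site.xml: point clients and datanodes at the renamed NameNode.
     "new-namenode-host" and port 9000 are placeholders, not values from
     the thread. -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://new-namenode-host:9000/</value>
</property>
```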

Re: problem getting map input filename

2009-06-03 Thread Sharad Agarwal
conf.get("map.input.file") should work. If not, then it is a bug in the new mapreduce API in 0.20 - Sharad

Image indexing/searching with Hadoop and MPI

2009-06-03 Thread tog
Hi there, This is a kind of newbie question (at least as far as Hadoop is concerned). I was wondering if there were any Hadoop-based projects around dealing with image indexing and searching? We are working in this area and it might be interesting to have a look at such a project. Second question is

Re: Image indexing/searching with Hadoop and MPI

2009-06-03 Thread Edward J. Yoon
This is a kind of newbie question (at least as far as Hadoop is concerned). I was wondering if there were any Hadoop-based projects around dealing with image indexing and searching? We are working in this area and it might be interesting to have a look at such a project. There is a text-search

Hadoop ReInitialization.

2009-06-03 Thread b
Hello all. I need to process many Gigs of new data every 10 minutes. Every 10 minutes cron launches a bash script, do.sh, that puts data into HDFS and launches processing. But... Hadoop isn't military software, so there is a probability of errors with HDFS. So I need to watch LOG files to catch

Re: Hadoop ReInitialization.

2009-06-03 Thread Steve Loughran
b wrote: But after formatting and starting DFS I need to wait some time (sleep 60) before putting data into HDFS. Else I will receive a NotReplicatedYetException. That means the namenode is up but there aren't enough workers yet.

Opera Software AS - Job Opening: Hadoop Engineer

2009-06-03 Thread Usman Waheed
Greetings All, Opera Software AS (www.opera.com) in Oslo/Norway is looking for an experienced Hadoop Engineer to join the Statistics Team in order to provide business intelligence metrics both internally and to our customers. If you have the experience and are willing to relocate to beautiful

Re: Subdirectory question revisited

2009-06-03 Thread David Rosenstrauch
OK, thanks for the pointer. If I wind up rolling our own code to handle this I'll make sure to contribute it. DR Aaron Kimball wrote: There is no technical limit that prevents Hadoop from operating in this fashion; it's simply the case that the included InputFormat implementations do not do

Re: problem getting map input filename

2009-06-03 Thread Rares Vernica
On 6/2/09, jason hadoop jason.had...@gmail.com wrote: you can always dump the entire property space and work it out that way. I dumped the property space and I could only find mapred.input.dir. There was no mapred.input.file. -- Rares

Re: Image indexing/searching with Hadoop and MPI

2009-06-03 Thread tog
On Wed, Jun 3, 2009 at 5:17 PM, Edward J. Yoon edwardy...@apache.org wrote: This is a kind of newbie question (at least as far as Hadoop is concerned). I was wondering if there were any Hadoop-based projects around dealing with image indexing and searching? We are working in this area and

Re: problem getting map input filename

2009-06-03 Thread He Yongqiang
take a look at HADOOP-5368, :) On 09-6-4 12:27 AM, Rares Vernica rvern...@gmail.com wrote: On 6/2/09, jason hadoop jason.had...@gmail.com wrote: you can always dump the entire property space and work it out that way. I dumped the property space and I could only find mapred.input.dir.

Command-line jobConf options in 0.18.3

2009-06-03 Thread Ian Soboroff
I'm backporting some code I wrote for 0.19.1 to 0.18.3 (long story), and I'm finding that when I run a job and try to pass options with -D on the command line, that the option values aren't showing up in my JobConf. I logged all the key/value pairs in the JobConf, and the option I passed

Sharing object between mappers on same node (reuse.jvm ?)

2009-06-03 Thread Tarandeep Singh
Hi, I want to share an object (a Lucene IndexWriter instance) between mappers running on the same node for one job (not across multiple jobs). Please correct me if I am wrong - if I set -1 for the property mapred.job.reuse.jvm.num.tasks, then all mappers of one job will be executed in the same JVM

Fastlz coming?

2009-06-03 Thread Kris Jirapinyo
Hi all, In the remove-lzo JIRA ticket https://issues.apache.org/jira/browse/HADOOP-4874 Tatu mentioned he was going to port fastlz from C to Java and provide a patch. Have there been any updates on that? Or is anyone working on any additional custom compression codecs? Thanks, Kris J.

Re: problem getting map input filename

2009-06-03 Thread Rares Vernica
On 6/3/09, He Yongqiang heyongqi...@software.ict.ac.cn wrote: take a look at HADOOP-5368, :) There you set map.input.file; I think it should already be set by Hadoop.

State of Eclipse Plugin

2009-06-03 Thread ANithian
Hi all, I am not sure if this is the right mailing list but I was wondering about the state of the Eclipse plugin for Hadoop. I have found it very valuable in my M/R development but have posted, seen and fixed a few bugs, and I haven't seen any response in JIRA. Is anyone still using or maintaining the

Re: Seattle / PNW Hadoop + Lucene User Group?

2009-06-03 Thread Bradford Stephens
Hey everyone! I just wanted to give a BIG THANKS for everyone who came. We had over a dozen people, and a few got lost at UW :) [I would have sent this update earlier, but I flew to Florida the day after the meeting]. If you didn't come, you missed quite a bit of learning and topics. Such as:

*.gz input files

2009-06-03 Thread Adam Silberstein
Hi, I have some hadoop code that works properly when the input files are not compressed, but it is not working for the gzipped versions of those files. My files are named with *.gz, but the format is not being recognized. I'm under the impression I don't need to set any JobConf parameters to

Re: *.gz input files

2009-06-03 Thread Alex Loddengaard
Hi Adam, Gzipped files don't play that nicely with Hadoop, because they aren't splittable. Can you use bzip2 instead? bzip2 files play more nicely with Hadoop, because they're splittable. If you're stuck with gzip, then take a look here: http://issues.apache.org/jira/browse/HADOOP-437. I
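Alex's point about splittability comes from how the gzip format works: a .gz file is one compressed stream, so a reader must start at byte 0 and cannot jump into the middle the way Hadoop splits plain-text files across map tasks. A plain java.util.zip sketch (not Hadoop's codec API) illustrating the single-stream round trip:

```java
import java.io.*;
import java.util.zip.*;

// Illustrative sketch: gzip data is a single compressed stream that must
// be decompressed from the start -- which is why Hadoop cannot split one
// .gz file across several map tasks.
public class GzipRoundTrip {
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(data);                        // compress everything at once
        }
        return bos.toByteArray();
    }

    static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPInputStream gz =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) bos.write(buf, 0, n);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "line one\nline two\n".getBytes("UTF-8");
        byte[] back = gunzip(gzip(original));
        System.out.println(new String(back, "UTF-8").equals("line one\nline two\n"));
    }
}
```

Because each map task would need the whole stream anyway, a gzipped input file ends up processed by a single mapper.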

Re: Seattle / PNW Hadoop + Lucene User Group?

2009-06-03 Thread Bhupesh Bansal
Great Bradford, Can you post some videos if you have some? Best Bhupesh On 6/3/09 11:58 AM, Bradford Stephens bradfordsteph...@gmail.com wrote: Hey everyone! I just wanted to give a BIG THANKS for everyone who came. We had over a dozen people, and a few got lost at UW :) [I would have

streaming a binary processing file

2009-06-03 Thread openresearch
Hi all, I have an urgent question regarding processing binary (image) data using Hadoop streaming. I am looking for the simplest solution, preferably without making changes to hadoop and/or the streaming package. I got some hints from this mailing list, including using a customized InputFormat, or

Re: Seattle / PNW Hadoop + Lucene User Group?

2009-06-03 Thread Bradford Stephens
Sorry, no videos this time. The conversation wasn't very structured... next month I'll record it :) On Wed, Jun 3, 2009 at 1:59 PM, Bhupesh Bansal bban...@linkedin.com wrote: Great Bradford, Can you post some videos if you have some ? Best Bhupesh On 6/3/09 11:58 AM, Bradford Stephens

Re: streaming a binary processing file

2009-06-03 Thread Zak Stone
One simple solution is to use Dumbo, a Python interface to Hadoop that supports binary streaming: http://wiki.github.com/klbostee/dumbo Zak On Wed, Jun 3, 2009 at 5:18 PM, openresearch qiming...@openresearchinc.com wrote: Hi all, I have a urgent question regarding processing binary (image)

Re: How do I convert DataInput and ResultSet to array of String?

2009-06-03 Thread Aaron Kimball
The text serializer will pull out an entire string by using a null terminator at the end. If you need to know the number of string objects, though, you'll have to serialize that before the strings, then use a for loop to decode the rest of them. - Aaron On Tue, Jun 2, 2009 at 6:01 PM, dealmaker
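The count-then-loop pattern Aaron describes can be sketched with plain java.io streams, which stand in here for Hadoop's DataOutput/DataInput interfaces:

```java
import java.io.*;

// Sketch of count-prefixed string serialization: write the number of
// strings first, then each string; the reader loops exactly that many
// times to decode the rest.
public class StringArraySerde {
    static void writeStrings(DataOutput out, String[] values) throws IOException {
        out.writeInt(values.length);          // count goes first
        for (String s : values) out.writeUTF(s);
    }

    static String[] readStrings(DataInput in) throws IOException {
        int n = in.readInt();                 // read the count back
        String[] values = new String[n];
        for (int i = 0; i < n; i++) values[i] = in.readUTF();
        return values;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        writeStrings(new DataOutputStream(bos), new String[] {"a", "b", "c"});
        String[] back = readStrings(
            new DataInputStream(new ByteArrayInputStream(bos.toByteArray())));
        System.out.println(back.length + " " + back[0] + back[1] + back[2]);
    }
}
```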

Re: Do I need to implement Readfields and Write Functions If I have Only One Field?

2009-06-03 Thread Aaron Kimball
If you can use an existing serializeable type to hold that field (e.g., if it's an integer, then use IntWritable) then you can just get away with that. If you are specifying your own class for a key or value class, then yes, the class must implement readFields() and write(). There's no concept of
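A minimal sketch of the readFields()/write() contract Aaron refers to, using only java.io so it runs without Hadoop on the classpath; the class and field names are made up for illustration. Note the no-argument constructor, which the framework needs in order to instantiate the type before calling readFields():

```java
import java.io.*;

// Sketch of the Writable contract: a custom key/value class provides
// write(DataOutput) and readFields(DataInput), plus a no-arg constructor.
public class PointWritable {
    private int x;
    private int y;

    public PointWritable() {}                          // required no-arg constructor
    public PointWritable(int x, int y) { this.x = x; this.y = y; }

    public void write(DataOutput out) throws IOException {
        out.writeInt(x);                               // serialize each field in order
        out.writeInt(y);
    }

    public void readFields(DataInput in) throws IOException {
        x = in.readInt();                              // deserialize in the same order
        y = in.readInt();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        new PointWritable(3, 4).write(new DataOutputStream(bos));
        PointWritable p = new PointWritable();         // framework-style: empty, then fill
        p.readFields(new DataInputStream(new ByteArrayInputStream(bos.toByteArray())));
        System.out.println(p.x + "," + p.y);
    }
}
```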

Re: Hadoop ReInitialization.

2009-06-03 Thread Aaron Kimball
You can block for safemode exit by running 'hadoop dfsadmin -safemode wait' rather than sleeping for an arbitrary amount of time. More generally, I'm a bit confused what you mean by all this. Hadoop daemons may individually crash, but you should never need to reformat HDFS and start from scratch.

Re: Command-line jobConf options in 0.18.3

2009-06-03 Thread Aaron Kimball
Are you running your program via ToolRunner.run()? How do you instantiate the JobConf object? - Aaron On Wed, Jun 3, 2009 at 10:19 AM, Ian Soboroff ian.sobor...@nist.gov wrote: I'm backporting some code I wrote for 0.19.1 to 0.18.3 (long story), and I'm finding that when I run a job and try to

Re: Do I need to implement Readfields and Write Functions If I have Only One Field?

2009-06-03 Thread dealmaker
I have the following as the type of my value object. Do I need to implement readFields and write functions? private static class StringArrayWritable extends ArrayWritable { private StringArrayWritable(String[] aSString) { super(aSString); } } Aaron Kimball-3 wrote:

Re: Command-line jobConf options in 0.18.3

2009-06-03 Thread Ian Soboroff
Yes, and I get the JobConf via 'JobConf job = new JobConf(conf, the.class)'. The conf is the Configuration object that comes from getConf. Pretty much copied from the WordCount example (which this program used to be a long while back...) thanks, Ian On Jun 3, 2009, at 7:09 PM, Aaron

Re: Command-line jobConf options in 0.18.3

2009-06-03 Thread Ian Soboroff
If after I call getConf to get the conf object, I manually add the key/value pair, it's there when I need it. So it feels like ToolRunner isn't parsing my args for some reason. Ian On Jun 3, 2009, at 8:45 PM, Ian Soboroff wrote: Yes, and I get the JobConf via 'JobConf job = new
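The behavior Ian expects comes from GenericOptionsParser, which ToolRunner runs over the arguments before handing the rest to the tool. A dependency-free sketch of just the -Dkey=value handling, with a hypothetical command line (GenericOptionsParser itself accepts a few more forms, e.g. a space between -D and the pair):

```java
import java.util.*;

// Sketch of the -D handling ToolRunner/GenericOptionsParser performs:
// each "-Dkey=value" argument becomes a configuration entry, and all
// remaining arguments are passed through to the tool's run() method.
public class GenericOptsSketch {
    public static void main(String[] args) {
        // hypothetical command line; in a real job these come from argv
        String[] argv = {"-Dmapred.reduce.tasks=4", "input", "output"};
        Map<String, String> conf = new LinkedHashMap<>();
        List<String> remaining = new ArrayList<>();
        for (String arg : argv) {
            if (arg.startsWith("-D") && arg.contains("=")) {
                String kv = arg.substring(2);          // strip the "-D"
                int eq = kv.indexOf('=');
                conf.put(kv.substring(0, eq), kv.substring(eq + 1));
            } else {
                remaining.add(arg);                    // left for the tool itself
            }
        }
        System.out.println(conf + " " + remaining);
    }
}
```

If the option never reaches the conf, checking where the arguments diverge from this shape (e.g. a stray space, or args consumed before ToolRunner sees them) is a reasonable first step.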

Re: question about when shuffle/sort start working

2009-06-03 Thread Jianmin Woo
Thanks for your information, Sharad. Do you have a sample of re-using static variables? Thanks, Jianmin From: sharad agarwal shara...@yahoo-inc.com To: core-user@hadoop.apache.org Sent: Wednesday, June 3, 2009 12:55:55 AM Subject: Re: question about

Re: question about when shuffle/sort start working

2009-06-03 Thread Jianmin Woo
Thanks a lot for your suggestions on the interplay between the job and the driver, Chuck. Yes, the job may hold some, say, training data, which is needed in each round of the job. I will check the link you provided. Actually, I am thinking of some really light-weight map/reduce jobs. For example,

Task files in _temporary not getting promoted out

2009-06-03 Thread Ian Soboroff
Ok, help. I am trying to create local task outputs in my reduce job, and they get created, then go poof when the job's done. My first take was to use FileOutputFormat.getWorkOutputPath, and create directories in there for my outputs (which are Lucene indexes). Exasperated, I then wrote a

Re: streaming a binary processing file

2009-06-03 Thread Sharad Agarwal
Binary support has been added for 0.21. One option is to wait for 0.21 to get released, or you might try applying the patch from HADOOP-1722. - Sharad

Re: question about when shuffle/sort start working

2009-06-03 Thread Sharad Agarwal
Jianmin Woo wrote: Do you have a sample of re-using static variables? You can define static variables in your Mapper/Reducer class. Static variables survive as long as the JVM is alive, so multiple tasks of the same job running in a single JVM would be able to share them. - Sharad
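Sharad's pattern can be sketched in plain Java; the class and resource here are made-up stand-ins (e.g. for Tarandeep's IndexWriter), and the point is that with mapred.job.reuse.jvm.num.tasks=-1 successive task instances in one JVM see the same static object:

```java
// Sketch of sharing a resource across tasks via a static field: the
// field is initialized once per JVM, so later tasks reuse it instead
// of re-creating it.
public class SharedResourceDemo {
    static class MyMapper {
        // shared by every task this JVM runs; lazily initialized
        private static StringBuilder sharedResource;

        static synchronized StringBuilder getResource() {
            if (sharedResource == null) {
                sharedResource = new StringBuilder("init");   // first task only
            }
            return sharedResource;
        }

        void runTask(String id) {
            getResource().append(":").append(id);  // reuse, don't re-create
        }
    }

    public static void main(String[] args) {
        new MyMapper().runTask("task1");   // two separate task instances...
        new MyMapper().runTask("task2");   // ...one shared static object
        System.out.println(MyMapper.getResource());
    }
}
```

In a real job the resource would also need a cleanup path (e.g. a JVM shutdown hook), since no single task knows it is the last one.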