from:"Jeff Zhang"

How to select N random records using MapReduce?

2011-06-27 Thread Jeff Zhang

Hi all, I'd like to select N random records from a large amount of data using Hadoop, and I wonder how I can achieve this. My current idea is to let each mapper task select N / mapper_number records. Does anyone have experience with this? -- Best Regards Jeff Zhang
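One caveat with the N / mapper_number idea: it is only uniform when every split holds the same number of records. A common alternative, sketched below in plain Java with no Hadoop dependency, is to tag every record with a random key, have each map task keep only its N smallest keys, and let a single reduce step keep the N smallest overall. The class and method names (RandomSampleSketch, mapSide, reduceSide) are made up for illustration and are not Hadoop APIs; wiring this into a real Mapper/Reducer is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Random;

// Hypothetical sketch: uniform sample of n records across unequal splits.
class RandomSampleSketch {
    /** A record paired with the random key drawn for it. */
    static final class Tagged {
        final double key;
        final String record;
        Tagged(double key, String record) { this.key = key; this.record = record; }
    }

    /** Max-heap on the key, so poll() evicts the candidate with the largest key. */
    static PriorityQueue<Tagged> newHeap() {
        return new PriorityQueue<>(Comparator.comparingDouble((Tagged t) -> t.key).reversed());
    }

    /** Adds a candidate and trims the heap back to at most n entries. */
    static void offer(PriorityQueue<Tagged> heap, Tagged t, int n) {
        heap.add(t);
        if (heap.size() > n) {
            heap.poll();
        }
    }

    /** What one map task would do: tag each record, keep the n smallest keys. */
    static List<Tagged> mapSide(Iterable<String> split, int n, Random rng) {
        PriorityQueue<Tagged> heap = newHeap();
        for (String record : split) {
            offer(heap, new Tagged(rng.nextDouble(), record), n);
        }
        return new ArrayList<>(heap);
    }

    /** What a single reduce task would do: keep the n smallest keys overall. */
    static List<String> reduceSide(List<List<Tagged>> mapOutputs, int n) {
        PriorityQueue<Tagged> heap = newHeap();
        for (List<Tagged> output : mapOutputs) {
            for (Tagged t : output) {
                offer(heap, t, n);
            }
        }
        List<String> sample = new ArrayList<>();
        for (Tagged t : heap) {
            sample.add(t.record);
        }
        return sample;
    }
}
```

Because the keys are independent of which split a record lives in, the N smallest keys form a uniform sample even when splits differ in size, and each map task sends at most N records to the reducer.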

Re: What is the property for setting the number of tolerated failed tasks in one job?

2011-05-11 Thread Jeff Zhang

lure. > Amar > > > > On 5/10/11 2:02 PM, "Jeff Zhang" wrote: > > > Hi all, > > I recall there's a property for setting the number of failed tasks > that can be tolerated in one job. Does anyone know the property name? > > -- Best Regards Jeff Zhang

What is the property for setting the number of tolerated failed tasks in one job?

2011-05-10 Thread Jeff Zhang

Hi all, I recall there's a property for setting the number of failed tasks that can be tolerated in one job. Does anyone know the property name? -- Best Regards Jeff Zhang

Re: Is it possible to get the number of mapper tasks?

2010-12-03 Thread Jeff Zhang

> In my mapper code I need to know the total number of mappers, which is the > same as the number of input splits. > (I need it for unique int ID generation.) > > > Basically I'm looking for an analog of context.getNumReduceTasks() but can't > find it. > > > Thanks > > > >> > -- Best Regards Jeff Zhang
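For the unique-ID use case in the quoted question, one possible approach is a strided counter: task i of M issues i, i + M, i + 2M, and so on, so no two tasks can collide. The sketch below is plain Java; StridedIdGenerator is a hypothetical name, not a Hadoop class. It assumes, for the 0.20-era API, that the map task count can be read from the job configuration under "mapred.map.tasks" and the task index from context.getTaskAttemptID().getTaskID().getId(); both are assumptions to verify against the Hadoop version in use.

```java
// Hypothetical sketch: collision-free int IDs across map tasks.
// Assumed wiring inside a Mapper (verify for your Hadoop version):
//   numTasks  = context.getConfiguration().getInt("mapred.map.tasks", 1)
//   taskIndex = context.getTaskAttemptID().getTaskID().getId()
class StridedIdGenerator {
    private final int taskIndex;
    private final int numTasks;
    private long issued = 0; // how many IDs this task has handed out so far

    StridedIdGenerator(int taskIndex, int numTasks) {
        if (numTasks <= 0 || taskIndex < 0 || taskIndex >= numTasks) {
            throw new IllegalArgumentException("need 0 <= taskIndex < numTasks");
        }
        this.taskIndex = taskIndex;
        this.numTasks = numTasks;
    }

    /** Returns taskIndex, taskIndex + numTasks, taskIndex + 2 * numTasks, ... */
    int next() {
        long id = issued * numTasks + taskIndex;
        issued++;
        // Fail loudly instead of wrapping around if the int range is exhausted.
        return Math.toIntExact(id);
    }
}
```

With three map tasks, task 0 would issue 0, 3, 6 and task 2 would issue 2, 5, 8. Note that a re-executed or speculative attempt of the same task produces the same sequence, which is usually what is wanted, since only one attempt's output is committed.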

Re: Starting a Hadoop job programmatically

2010-11-25 Thread Jeff Zhang

erver A). But on Server B, I can't telnet to Server A. (The Hadoop server > is running on Server A.) > If I use netstat -a to check the port, I can't find port 9001. > I have no idea why I can't run the job from the other server. If anyone can > give me some suggestions, that would be much appreciated. > Thanks > Best Regards > -- > -李平 > -- > -李平 > -- Best Regards Jeff Zhang

Re: Yahoo Open Source Real-Time MapReduce

2010-11-09 Thread Jeff Zhang

en, Yes, "stream process" should be more accurate than "real-time". On Tue, Nov 9, 2010 at 6:36 PM, Bibek Paudel wrote: > On Tue, Nov 9, 2010 at 10:49 AM, Jeff Zhang wrote: >> Not sure whether this has been posted on this mailing list. But I strongly >> feel to

Yahoo Open Source Real-Time MapReduce

2010-11-09 Thread Jeff Zhang

Not sure whether this has been posted on this mailing list, but I feel strongly that I should tell everyone here about "Yahoo Open Source Real-Time MapReduce". See http://s4.io/ for more details. Thanks again to Yahoo for its contribution to the open source world. -- Best Regards Jeff Zhang

Re: Job without Output files

2010-11-08 Thread Jeff Zhang

My guess is that HBase keeps versions on cells, so inserting multiple times is OK, though I'm not sure my guess is correct. On Mon, Nov 8, 2010 at 8:32 PM, Harsh J wrote: > Hi Jeff, > > On Mon, Nov 8, 2010 at 3:17 PM, Jeff Zhang wrote: >> Hi Harsh, >> >> your point is

Re: Job without Output files

2010-11-08 Thread Jeff Zhang

you handle speculative execution of > tasks (if it is turned on)? > > -- > Harsh J > www.harshj.com > -- Best Regards Jeff Zhang

Re: Job without Output files

2010-11-07 Thread Jeff Zhang

Thanks > -- > Regards > Shuja-ur-Rehman Baig > > > -- Best Regards Jeff Zhang

Re: help with rewriting hadoop java code for new API: RecordReader getPos()

2010-10-27 Thread Jeff Zhang

ed for the type > RecordReader > > Any pointers or help will be highly appreciated. > > Thanks, > Bibek > > [0] > http://hadoop.apache.org/common/docs/r0.20.1/api/org/apache/hadoop/mapred/RecordReader.html#getPos%28%29 > [1] http://www.slideshare.net/sh1mmer/upgrading-to-the-new-map-reduce-api > -- Best Regards Jeff Zhang

Re: How to modify task assignment algorithm?

2010-10-07 Thread Jeff Zhang

10:38 AM, Shen LI wrote: > Hi, Thanks you very much for your reply. I want to run my own algorithm for > this part to see if we can achieve better outcome in specific scenario. So > how can I modify it? > Thanks a lot! > Shen > > On Thu, Oct 7, 2010 at 6:33 PM, Jeff Zhang wr

Re: How to modify task assignment algorithm?

2010-10-07 Thread Jeff Zhang

scheduler) > Big thanks, > Shen -- Best Regards Jeff Zhang

Re: Hdfs Block Size

2010-10-07 Thread Jeff Zhang

block defragmentation etc. ? > > Thanks, > -Rakesh > -- Best Regards Jeff Zhang

Re: Is Hadoop suitable for web site visitor analysis?

2010-07-08 Thread Jeff Zhang

lternative > approach. > > > Any pointers would be greatly appreciated. > > Thanks, > Tim > > > > > > -- Best Regards Jeff Zhang

Re: name of input file which has the key value pair

2010-05-17 Thread Jeff Zhang

o you believe in fate, Neo? > Neo: No. > Morpheus: Why Not? > Neo: Because I don't like the idea that I'm not in control of my life. > > > > > -- Best Regards Jeff Zhang

Re: Configured & PathFilter

2010-04-14 Thread Jeff Zhang

Kris, try using /test-batchEventLog/metrics<http://hadoop-eventlog01.socialmedia.com/test-batchEventLog/metrics> /* (append an asterisk). On Wed, Apr 14, 2010 at 7:26 AM, Kris Nuttycombe wrote: > On Wed, Apr 14, 2010 at 2:16 AM, Jeff Zhang wrote: > > Hi Kris, > > > > I a

Re: Configured & PathFilter

2010-04-14 Thread Jeff Zhang

stStatus(SequenceFileInputFormat.java:55) > >> at > org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241) > >>at > org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885) > >>at > org.apach

Re: Configured & PathFilter

2010-04-13 Thread Jeff Zhang

>> This indicates that reflection will be used to instantiate the > >> required PathFilter object, and I need to be able to access the > >> minimum and maximum date for a given run. I don't want to have to > >> implement a separate PathFilter class for each set o

Re: Configured & PathFilter

2010-04-12 Thread Jeff Zhang

>> This indicates that reflection will be used to instantiate the > >> required PathFilter object, and I need to be able to access the > >> minimum and maximum date for a given run. I don't want to have to > >> implement a separate PathFilter class for each set o

Re: Configured & PathFilter

2010-04-12 Thread Jeff Zhang

have to > hard-code a separate PathFilter instance for each date range I'm > interested in, obviously. If I make my PathFilter extend Configured, > will it do the right thing? > > Thanks! > > Kris > -- Best Regards Jeff Zhang

Re: job.jar

2010-03-15 Thread Jeff Zhang

Is it possible to create a job.jar file from the bash command line? > > > PS: > I've put some posts on the MR mailing list that weren't answered. Can these > posts be viewed by other users? > > > Regards > -- > Pedro > -- Best Regards Jeff Zhang

Re: How can I get system environment variable in core-site.xml

2010-03-11 Thread Jeff Zhang

oint, it's the same directory, and it can only be locked once. > So that's why I can't deploy Hadoop. > > Best Regards > welman Lu > -- Best Regards Jeff Zhang

Re: How can I get system environment variable in core-site.xml

2010-03-11 Thread Jeff Zhang

ame contents inside this $HOME > directory. > > I borrowed these three computers from a big cluster, and I only use ssh to > control them remotely. > I am not sure what they did to this cluster, but it is really terrible > for me. > > Regards > welman Lu > -- Best Regards Jeff Zhang

Re: How can I get system environment variable in core-site.xml

2010-03-11 Thread Jeff Zhang

>> Unfortunately, I don't know where I can set the code you mentioned. >> Can you tell me more about that? >> Thanks! >> >> Regards >> welman Lu >> > > -- Best Regards Jeff Zhang

Re: How can I get system environment variable in core-site.xml

2010-03-11 Thread Jeff Zhang

> Regards > welman Lu > -- Best Regards Jeff Zhang

Re: How can I get system environment variable in core-site.xml

2010-03-11 Thread Jeff Zhang

{HOSTNAME}, ${env.hostname}, but neither of them works. > It just returns the strings "${HOSTNAME}" and "${env.hostname}" > themselves. > > So can anybody tell me what I should use to get this environment variable? > Thank you! > > welman Lu > -- Best Regards Jeff Zhang

Re: Using SequenceFiles in Hadoop for an imaging application.

2010-01-19 Thread Jeff Zhang

{ > > JobClient.runJob(conf); > } catch (Exception e) { > e.printStackTrace(); > } > } > > > > Thanks, > > Regards, > > Suhail Rehman > MS by Research in Computer Science > International Institute of Information Technology - Hyderabad > reh...@research.iiit.ac.in > - > http://research.iiit.ac.in/~rehman <http://research.iiit.ac.in/%7Erehman> > -- Best Regards Jeff Zhang

Re: Question about setting the number of mappers.

2010-01-18 Thread Jeff Zhang

b, submitSplitFile); > } > job.set("mapred.job.split.file", submitSplitFile.toString()); > job.setNumMapTasks(maps); > > // Write job file to JobTracker's fs > FSDataOutputStream out = > FileSystem.create(fs, submitJobFile, > new FsPermission(JOB_FILE_PERMISSION)); > > try { > job.writeXml(out); > } finally { > out.close(); >. > > } > > > *** > > Is there anything I can do to get the number of mappers to be more > flexible? > > > Cheers, > > Teryl > > -- Best Regards Jeff Zhang

Problems configuring FairScheduler

2009-12-10 Thread Jeff Zhang

uster Although I did all of this, I cannot open the page http:///scheduler. Did I miss something? Thank you for any help. Jeff Zhang

Re: Question regarding wordCount example

2009-10-25 Thread Jeff Zhang

as you can Jeff Zhang On Mon, Oct 26, 2009 at 6:35 AM, felix gao wrote: > Hi all, I have some questions regarding how to compile a simple Hadoop > program. > > setup > Java 1.6 > Ubuntu 9.02 > Hadoop 0.19.2 > > > //below is the mapper class > imp

31 matches
