Hi all,
I'd like to select N random records from a large amount of data using
Hadoop; I just wonder how I can achieve this. Currently my idea is to let
each mapper task select N / mapper_number records. Does anyone have such
experience?
--
Best Regards
Jeff Zhang
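
One way to do the per-mapper selection in a single pass, without knowing the
split size up front, is reservoir sampling. A minimal sketch, assuming the
per-mapper quota N / mapper_number is passed in as a job property (the name
sample.per.mapper below is made up):

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Classic reservoir sampling: after i records have been seen, the i-th
// record replaces a random reservoir slot with probability quota / i.
public class SampleMapper
    extends Mapper<LongWritable, Text, NullWritable, Text> {

  private int quota;
  private long seen = 0;
  private final List<Text> reservoir = new ArrayList<Text>();
  private final Random rnd = new Random();

  @Override
  protected void setup(Context context) {
    quota = context.getConfiguration().getInt("sample.per.mapper", 100);
  }

  @Override
  protected void map(LongWritable key, Text value, Context context) {
    seen++;
    if (reservoir.size() < quota) {
      reservoir.add(new Text(value));     // copy; Hadoop reuses the object
    } else {
      long slot = (long) (rnd.nextDouble() * seen);
      if (slot < quota) {
        reservoir.set((int) slot, new Text(value));
      }
    }
  }

  @Override
  protected void cleanup(Context context)
      throws IOException, InterruptedException {
    for (Text t : reservoir) {
      context.write(NullWritable.get(), t);
    }
  }
}

Note the combined sample is only uniform if every mapper sees about the same
number of records; for an exact uniform N you'd need a second pass, or a
single reducer running one more reservoir over the mappers' output.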
Hi all,
I just remembered there's a property for setting the number of failed tasks
that can be tolerated in one job. Does anyone know the property's name?
--
Best Regards
Jeff Zhang
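
If I remember right, the knobs are percentages rather than absolute counts:
mapred.max.map.failures.percent and mapred.max.reduce.failures.percent, with
matching setters on the old-API JobConf. A quick sketch (the values are just
examples):

import org.apache.hadoop.mapred.JobConf;

public class FailureTolerance {
  public static JobConf configure(JobConf conf) {
    // Let the job succeed even if up to 5% of map tasks and 1% of
    // reduce tasks fail (after exhausting their per-task attempts).
    conf.setMaxMapTaskFailuresPercent(5);
    conf.setMaxReduceTaskFailuresPercent(1);
    return conf;
  }
}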
> In my mapper code I need to know the total number of mappers which is the
> same as number of input splits.
> (I need it for unique int Id generation)
>
>
> Basically I'm looking for an analog of context.getNumReduceTasks() but
> can't find it.
>
>
> Thanks
--
Best Regards
Jeff Zhang
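
As far as I know there is no direct Context call for this, but the framework
records the total in the job configuration under "mapred.map.tasks", and the
task's own index is available from its TaskID, which is usually all you need
for unique int IDs. A sketch under those assumptions:

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class UniqueIdMapper
    extends Mapper<LongWritable, Text, Text, LongWritable> {

  private int numMappers;
  private int myIndex;
  private long counter = 0;

  @Override
  protected void setup(Context context) {
    // Total number of map tasks, as set by the framework at submit time.
    numMappers = context.getConfiguration().getInt("mapred.map.tasks", 1);
    // Zero-based index of this map task among all map tasks.
    myIndex = context.getTaskAttemptID().getTaskID().getId();
  }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // Task i emits ids i, i + numMappers, i + 2 * numMappers, ...
    // so ids never collide across tasks.
    long id = myIndex + counter * numMappers;
    counter++;
    context.write(value, new LongWritable(id));
  }
}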
> …Server A), but on Server B I can't telnet to Server A. (The Hadoop
> server is running on Server A.)
> If I use netstat -a to check the port, I can't find port 9001.
> I have no idea why I can't run the job on the other server. If anyone can
> give me some suggestions, that would be much appreciated.
> Thanks
> Best Regards
> --
> -李平
>
--
Best Regards
Jeff Zhang
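
For what it's worth, if netstat on Server A shows nothing on 9001, the
JobTracker either isn't running or is bound to an interface Server B can't
reach; mapred.job.tracker set to localhost:9001 is the usual culprit. A
quick check (hostnames are placeholders):

# On Server A: what, if anything, is listening on 9001?
netstat -an | grep 9001
# 127.0.0.1:9001 means only local clients can connect;
# 0.0.0.0:9001 or the LAN address means remote clients can too.

# On Server B: basic reachability test.
telnet server-a 9001

If it is bound to 127.0.0.1, change mapred.job.tracker (and fs.default.name
for the NameNode) to the real hostname and restart the daemons.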
…Yes, "stream processing" should be more accurate than "real-time".
On Tue, Nov 9, 2010 at 6:36 PM, Bibek Paudel wrote:
> On Tue, Nov 9, 2010 at 10:49 AM, Jeff Zhang wrote:
Not sure whether this has been posted on this mailing list already, but I
strongly feel I should tell everyone here that Yahoo has open-sourced
"Real-Time MapReduce". See http://s4.io/ for more details.
And thanks again to Yahoo for its contribution to the open source world.
--
Best Regards
Jeff Zhang
My guess is that HBase keeps versions on cells, so inserting the same record
multiple times is OK; I'm not sure my guess is correct, though.
On Mon, Nov 8, 2010 at 8:32 PM, Harsh J wrote:
> Hi Jeff,
>
> On Mon, Nov 8, 2010 at 3:17 PM, Jeff Zhang wrote:
>> Hi Harsh,
>>
>> your point is…
> …how do you handle speculative execution of
> tasks (if it is turned on)?
>
> --
> Harsh J
> www.harshj.com
>
--
Best Regards
Jeff Zhang
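
If the guess about versioned cells is right, one way to make re-execution
harmless is to write with an explicit timestamp derived from the record, so
a speculative duplicate rewrites the very same cell version. A sketch
against the HBase 1.x client API; the table and column names are invented:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class IdempotentWrite {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table table = conn.getTable(TableName.valueOf("events"))) {
      // A timestamp taken from the record, not from the clock, means a
      // retried or speculative task writes an identical cell version.
      long ts = 1234567890000L;
      Put put = new Put(Bytes.toBytes("row-1"));
      put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("v"), ts,
                    Bytes.toBytes("payload"));
      table.put(put);  // running this twice leaves one logical cell
    }
  }
}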
Thanks
> --
> Regards
> Shuja-ur-Rehman Baig
--
Best Regards
Jeff Zhang
> …undefined for the type
> RecordReader
>
> Any pointers or help will be highly appreciated.
>
> Thanks,
> Bibek
>
> [0]
> http://hadoop.apache.org/common/docs/r0.20.1/api/org/apache/hadoop/mapred/RecordReader.html#getPos%28%29
> [1] http://www.slideshare.net/sh1mmer/upgrading-to-the-new-map-reduce-api
>
--
Best Regards
Jeff Zhang
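
The new-API RecordReader only exposes getProgress(), but with TextInputFormat
the byte offset survives as the record key, which is usually what old
getPos() callers were after. A small sketch:

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class PosMapper
    extends Mapper<LongWritable, Text, Text, LongWritable> {

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // With TextInputFormat the key is the byte offset of this line in
    // the file, effectively what old-API getPos() returned just before
    // the record was read.
    long pos = key.get();
    context.write(value, new LongWritable(pos));
  }
}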
…10:38 AM, Shen LI wrote:
> Hi, thank you very much for your reply. I want to run my own algorithm for
> this part to see if we can achieve a better outcome in a specific scenario.
> So how can I modify it?
> Thanks a lot!
> Shen
>
> On Thu, Oct 7, 2010 at 6:33 PM, Jeff Zhang wrote:
> …scheduler)
> Big thanks,
> Shen
--
Best Regards
Jeff Zhang
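
If I remember right, the task scheduler is pluggable: the FairScheduler and
CapacityScheduler are just subclasses of the abstract
org.apache.hadoop.mapred.TaskScheduler, selected through the
mapred.jobtracker.taskScheduler property. So the route is to subclass
TaskScheduler with your own assignment algorithm (the class will probably
need to live in the org.apache.hadoop.mapred package, since it touches
package-private types), put the jar on the JobTracker's classpath, and point
the property at it in mapred-site.xml; MyScheduler below is a made-up name:

<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.MyScheduler</value>
</property>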
> …block defragmentation etc.?
>
> Thanks,
> -Rakesh
>
--
Best Regards
Jeff Zhang
> …alternative
> approach.
>
>
> Any pointers would be greatly appreciated.
>
> Thanks,
> Tim
--
Best Regards
Jeff Zhang
> Morpheus: Do you believe in fate, Neo?
> Neo: No.
> Morpheus: Why Not?
> Neo: Because I don't like the idea that I'm not in control of my life.
--
Best Regards
Jeff Zhang
Kris,
Try using
/test-batchEventLog/metrics/*
(append an asterisk).
On Wed, Apr 14, 2010 at 7:26 AM, Kris Nuttycombe wrote:
> On Wed, Apr 14, 2010 at 2:16 AM, Jeff Zhang wrote:
> > Hi Kris,
> >
> > I a…
> >> …at org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:55)
> >> at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
> >> at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
> >> at org.apach…
> >> This indicates that reflection will be used to instantiate the
> >> required PathFilter object, and I need to be able to access the
> >> minimum and maximum date for a given run. I don't want to have to
> >> implement a separate PathFilter class for each set o…
> …have to
> hard-code a separate PathFilter instance for each date range I'm
> interested in, obviously. If I make my PathFilter extend Configured,
> will it do the right thing?
>
> Thanks!
>
> Kris
>
--
Best Regards
Jeff Zhang
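
Extending Configured should indeed do the right thing: FileInputFormat
builds the filter with ReflectionUtils.newInstance(), which calls setConf()
on anything Configurable, so the job Configuration is in place before
accept() runs. A sketch, assuming ISO yyyy-MM-dd file names; the property
names min.date and max.date are invented:

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;

// One reusable filter; the date range travels in the job configuration
// instead of being hard-coded per class.
public class DateRangeFilter extends Configured implements PathFilter {

  @Override
  public boolean accept(Path path) {
    String min = getConf().get("min.date", "0000-00-00");
    String max = getConf().get("max.date", "9999-99-99");
    String name = path.getName();               // e.g. "2010-04-13.log"
    if (name.length() < 10) {
      return true;  // directories and other non-date names pass through
    }
    String day = name.substring(0, 10);
    // Lexicographic comparison works for ISO dates.
    return day.compareTo(min) >= 0 && day.compareTo(max) <= 0;
  }
}

Wire it up when submitting:

job.getConfiguration().set("min.date", "2010-04-01");
job.getConfiguration().set("max.date", "2010-04-14");
FileInputFormat.setInputPathFilter(job, DateRangeFilter.class);

One caveat: the filter is also applied to directories during input listing,
hence the length check above letting non-date names through.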
Is it possible to create a job.jar file in the bash command line?
>
>
> PS:
> I've put some posts on the MR mailing list that weren't answered. Can
> these posts be viewed by other users?
>
>
> Regards
> --
> Pedro
>
--
Best Regards
Jeff Zhang
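
Yes; a job jar is an ordinary jar containing your classes (plus, optionally,
dependency jars under a lib/ directory inside it). Something along these
lines, with paths and names adjusted to your layout (hadoop-core.jar and
MyMainClass are placeholders):

mkdir -p classes
javac -classpath "$HADOOP_HOME/hadoop-core.jar" -d classes src/*.java
jar cf job.jar -C classes .
# Optional: bundle dependencies so the tasks see them on their classpath.
#   jar uf job.jar lib/some-dependency.jar
hadoop jar job.jar MyMainClass input output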
> …point, it's the same directory, and it can only be locked once.
> So that's why I can't deploy Hadoop.
>
> Best Regards
> welman Lu
>
--
Best Regards
Jeff Zhang
> …same contents inside this $HOME
> directory.
>
> I borrowed these three computers from a big cluster, and I only use ssh to
> remote-control them.
> I am not sure what they did to this cluster, but it's really a terrible
> situation for me.
>
> Regards
> welman Lu
>
--
Best Regards
Jeff Zhang
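
If the three machines share one NFS $HOME, that would explain the lock
error: every daemon tries to lock the same storage directory. The usual fix
is to point Hadoop's writable directories at node-local disk instead, e.g.
with the classic 0.20-era property names:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/tmp/hadoop-${user.name}</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/tmp/hadoop-${user.name}/dfs/data</value>
</property>

(/tmp is only for experimenting; on a real cluster use a dedicated local
data directory.)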
>> Unfortunately, I don't know where I can set the code you mentioned.
>> Can you tell me more about that?
>> Thanks!
>>
>> Regards
>> welman Lu
>>
>
>
--
Best Regards
Jeff Zhang
> …${HOSTNAME} and ${env.hostname}; neither of them works.
> They just return the strings "${HOSTNAME}" and "${env.hostname}"
> themselves.
>
> So can anybody tell me what I should use to get this environment variable?
> Thank you!
>
> welman Lu
>
--
Best Regards
Jeff Zhang
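
As far as I know, Configuration only expands ${...} against other
configuration properties and Java system properties, not shell environment
variables, so ${HOSTNAME} has nothing to resolve to. One workaround is to
turn the environment variable into a system property when the daemons start,
e.g. in hadoop-env.sh (host.name below is an arbitrary, made-up property
name):

# hadoop-env.sh: expose the shell's hostname as a Java system property.
export HADOOP_OPTS="$HADOOP_OPTS -Dhost.name=$(hostname)"

A configuration value can then reference it as ${host.name}.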
> try {
>   JobClient.runJob(conf);
> } catch (Exception e) {
>   e.printStackTrace();
> }
> }
>
> Thanks,
>
> Regards,
>
> Suhail Rehman
> MS by Research in Computer Science
> International Institute of Information Technology - Hyderabad
> reh...@research.iiit.ac.in
> -
> http://research.iiit.ac.in/~rehman
>
--
Best Regards
Jeff Zhang
> …(job, submitSplitFile);
> }
> job.set("mapred.job.split.file", submitSplitFile.toString());
> job.setNumMapTasks(maps);
>
> // Write job file to JobTracker's fs
> FSDataOutputStream out =
>     FileSystem.create(fs, submitJobFile,
>         new FsPermission(JOB_FILE_PERMISSION));
>
> try {
>   job.writeXml(out);
> } finally {
>   out.close();
> }
>
>
> ***
>
> Is there anything I can do to get the number of mappers to be more
> flexible?
>
>
> Cheers,
>
> Teryl
>
>
--
Best Regards
Jeff Zhang
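
Note that setNumMapTasks() is only a hint; the actual map count comes from
the InputFormat's splits. So the practical levers are the split-size
settings, or a custom InputFormat. A sketch with the old-API property name:

import org.apache.hadoop.mapred.JobConf;

public class SplitTuning {
  public static JobConf tune(JobConf conf) {
    // Raising the minimum split size reduces the number of maps for
    // large files; a maximum split size (mapred.max.split.size in newer
    // versions) pushes it the other way.
    conf.setLong("mapred.min.split.size", 256L * 1024 * 1024);  // 256 MB
    return conf;
  }
}

With many small files, each file still gets at least one map; packing
several files into one split needs something like MultiFileInputFormat or
the later CombineFileInputFormat.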
…cluster.
Although I did all this work, I still can't open the page http:///scheduler.
Did I miss something?
Thank you for any help.
Jeff Zhang
On Mon, Oct 26, 2009 at 6:35 AM, felix gao wrote:
> Hi all, I have some questions regarding how to compile a simple Hadoop
> program.
>
> setup
> Java 1.6
> Ubuntu 9.02
> Hadoop 0.19.2
>
>
> // below is the mapper class
> imp…
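
For 0.19.2, compiling against the core jar from the tarball and running via
bin/hadoop is usually enough; MyJob is a placeholder for the main class:

# Run from the directory containing the Hadoop 0.19.2 tarball contents.
mkdir -p classes
javac -classpath hadoop-0.19.2-core.jar -d classes MyJob.java
jar cf myjob.jar -C classes .
bin/hadoop jar myjob.jar MyJob input output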