Re: Any thoughts making Submarine a separate Apache project?

2019-07-18 Thread Jeff Zhang
+1. This is Jeff Zhang from the Zeppelin community. Thanks, Xun, for bringing this up. Submarine was integrated into Zeppelin several months ago, and I already see some early adoption of it in China. AI is a fast-growing area, and I believe moving into a separate project would be helpful for Submarine to

Bug of InputSampler

2011-04-08 Thread Jeff Zhang
ound(stepSize * i); while (last >= k && comparator.compare(samples[last], samples[k]) == 0) { ++k; } if (k>=samples.length) break; writer.append(samples[k], nullValue); last = k; } -- Best Regards Jeff Zhang
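The fragment above is cut off at both ends, so the following is only a standalone sketch of the kind of split-point loop it appears to come from, not Hadoop's actual InputSampler code. The class and method names (SplitPointSketch, selectSplitPoints) are hypothetical, the split points are collected into a list instead of being written with writer.append, and the surrounding for loop and the Math.round call are assumptions inferred from the visible text. It illustrates why a bounds check on k is needed after skipping duplicate samples: when there are fewer samples than partitions, k can advance past the end of the array.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Standalone sketch of a split-point loop; NOT Hadoop's InputSampler class. */
final class SplitPointSketch {

    /**
     * Picks at most numPartitions - 1 split points from already-sorted samples.
     * If the candidate index does not move past the previously chosen one,
     * it is advanced while the sample there equals the previous split point.
     */
    static <K> List<K> selectSplitPoints(K[] samples, int numPartitions,
                                         Comparator<? super K> comparator) {
        List<K> splitPoints = new ArrayList<>();
        // Assumed setup: evenly spaced candidate indices across the samples.
        float stepSize = samples.length / (float) numPartitions;
        int last = -1;
        for (int i = 1; i < numPartitions; ++i) {
            int k = Math.round(stepSize * i);
            while (last >= k && comparator.compare(samples[last], samples[k]) == 0) {
                ++k;
            }
            // Guard from the snippet: k may now equal samples.length, so stop
            // instead of reading samples[k] out of bounds.
            if (k >= samples.length) {
                break;
            }
            splitPoints.add(samples[k]);
            last = k;
        }
        return splitPoints;
    }
}
```

With two samples and four requested partitions, for example, the second iteration pushes k to 2, which equals the array length; without the guard the next array access would throw ArrayIndexOutOfBoundsException.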

Re: What is the reason for putting the output of one mapper task into one file ?

2010-06-17 Thread Jeff Zhang
ng a few mid-to-large sized jobs (in terms of m * r). > > https://issues.apache.org/jira/browse/HADOOP-331 > > Arun > > On Jun 16, 2010, at 7:53 PM, Jeff Zhang wrote: > >> Hi all, >> >> I check the source code of Mapper Task, it seems that the output of >

What is the reason for putting the output of one mapper task into one file ?

2010-06-16 Thread Jeff Zhang
knows the Partitioner, and the logic will be much easier. Is there any performance consideration for putting the output into one file? Thanks. -- Best Regards Jeff Zhang

Re: Why hadoop jobs need setup and cleanup phases which would consume a lot of time ?

2010-03-10 Thread Jeff Zhang
y setups and > cleanups. What's hadoop philosophy on this? > > Thanks, > Min > -- > My research interests are distributed systems, parallel computing and > bytecode based virtual machine. > > My profile: > http://www.linkedin.com/in/coderplay > My blog: > http://coderplay.javaeye.com > -- Best Regards Jeff Zhang

Fwd: What is the biggest problem of extremely large hadoop cluster ?

2010-02-21 Thread Jeff Zhang
-- Forwarded message -- From: Jeff Zhang Date: Sun, Feb 21, 2010 at 7:49 AM Subject: What is the biggest problem of extremely large hadoop cluster ? To: hdfs-...@hadoop.apache.org Hi, I am curious to know what the biggest problem of an extremely large Hadoop cluster is. What I

Re: Why not making InputSplit implements interface Writable ?

2010-02-06 Thread Jeff Zhang
t serialization. You can see > where splits are serialized at JobSplitWriter.writeNewSplits() and > deserialized on the task node at MapTask.getSplitDetails(). This is in > contrast to the old API which mandated that InputSplits had to be > Writable. > > Cheers, > Tom > > O

Why not making InputSplit implements interface Writable ?

2010-02-05 Thread Jeff Zhang
Split. -- Best Regards Jeff Zhang

Re: The idea to enhance MapReduce to resolve the skew problem

2010-02-04 Thread Jeff Zhang
If there are too many values for a key, adding a new MapReduce procedure is > necessary. > If too many keys hash to a partition, resplitting is necessary. > > If every split is balanced, we can treat a task (map or reduce) as a > scheduler timeslice, and the scheduler will be smart like an OS scheduler. > -- Best Regards Jeff Zhang

Re: Ideas for dynamic change reducer task number ?

2009-11-26 Thread Jeff Zhang
Owen, It works, thank you for your help. Jeff Zhang On Tue, Nov 24, 2009 at 8:36 AM, Jeff Zhang wrote: > > You're right, I will try that. > > Thank you > > > Jeff Zhang > > > > On Mon, Nov 23, 2009 at 9:19 AM, Owen O'Malley wrote: > >

Re: Ideas for dynamic change reducer task number ?

2009-11-23 Thread Jeff Zhang
You're right, I will try that. Thank you Jeff Zhang On Mon, Nov 23, 2009 at 9:19 AM, Owen O'Malley wrote: > > On Nov 22, 2009, at 4:48 PM, Jeff Zhang wrote: > > My concern is that it is just like hard code to use conf.setNumReduceTasks >> on the configuratio

Re: Ideas for dynamic change reducer task number ?

2009-11-22 Thread Jeff Zhang
Owen, My concern is that calling conf.setNumReduceTasks on the configuration amounts to hard-coding the value. It is not flexible, so my idea is to add an interface that changes the number of reducers dynamically according to the size of the input data set. Jeff Zhang On Sun, Nov 22, 2009 at 11:10
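The idea of choosing the reducer count from the input size could be sketched roughly as below. This is an illustration under stated assumptions, not an existing Hadoop API: the interface ReducerCountEstimator, the class BytesPerReducerEstimator, and the parameters bytesPerReducer and maxReducers are all hypothetical names invented for this example, and the sizing rule (one reducer per fixed amount of input, capped at a maximum) is just one possible policy.

```java
/** Hypothetical plug-in point; NOT part of Hadoop's API. */
interface ReducerCountEstimator {
    /** Returns the number of reduce tasks to use for the given input size. */
    int estimate(long inputBytes);
}

/** Example policy: one reducer per bytesPerReducer of input, capped at maxReducers. */
final class BytesPerReducerEstimator implements ReducerCountEstimator {
    private final long bytesPerReducer;
    private final int maxReducers;

    BytesPerReducerEstimator(long bytesPerReducer, int maxReducers) {
        if (bytesPerReducer <= 0 || maxReducers <= 0) {
            throw new IllegalArgumentException("both arguments must be positive");
        }
        this.bytesPerReducer = bytesPerReducer;
        this.maxReducers = maxReducers;
    }

    @Override
    public int estimate(long inputBytes) {
        if (inputBytes <= 0) {
            return 1; // always run at least one reducer
        }
        // Ceiling division written to avoid overflow for very large inputs.
        long needed = (inputBytes - 1) / bytesPerReducer + 1;
        return (int) Math.min(needed, (long) maxReducers);
    }
}
```

A job driver could then compute the total input size itself and pass the result of estimate(...) to conf.setNumReduceTasks, so that users who want a different policy only need to supply their own implementation of the interface.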

Ideas for dynamic change reducer task number ?

2009-11-22 Thread Jeff Zhang
, they only need to provide their customized implementation. This is my initial idea, and I look forward to hearing feedback from the experts. Thank you Jeff Zhang

How can I get the permission to contribute to hadoop ?

2009-10-24 Thread Jeff Zhang
Hi all, I'd like to contribute to Hadoop, and I'd like to get started by fixing bugs. But JIRA says that I have no permission to work on the JIRA item. So how can I get permission to contribute to Hadoop? Thank you. Jeff Zhang