+1, this is Jeff Zhang from the Zeppelin community.
Thanks Xun for bringing this up. Submarine was integrated into Zeppelin
several months ago, and I have already seen some early adoption of it in China.
AI is a fast-growing area, and I believe moving into a separate project would be
helpful for Submarine to
ound(stepSize * i);
  while (last >= k && comparator.compare(samples[last], samples[k]) == 0) {
    ++k;
  }
  if (k >= samples.length)
    break;
  writer.append(samples[k], nullValue);
  last = k;
}
--
Best Regards
Jeff Zhang
ng a few mid-to-large sized jobs (in terms of m * r).
>
> https://issues.apache.org/jira/browse/HADOOP-331
>
> Arun
>
> On Jun 16, 2010, at 7:53 PM, Jeff Zhang wrote:
>
>> Hi all,
>>
>> I checked the source code of the Mapper Task; it seems that the output of
>&
knows the Partitioner, and the
logic will be much easier. Is there any performance consideration for
putting the output into one file? Thanks.
--
Best Regards
Jeff Zhang
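For context on the question above, the default route from a map output record to a reduce partition is a simple hash rule. Below is a minimal standalone sketch that mirrors the behavior of Hadoop's HashPartitioner; the class and method here are illustrative, not Hadoop's actual API:

```java
// Standalone sketch mirroring the behavior of Hadoop's default
// HashPartitioner (illustrative; not Hadoop's actual class).
public class HashPartitionSketch {
    // Mask off the sign bit so the result is always in [0, numReduceTasks),
    // even when hashCode() is negative.
    static int getPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        // Every record with a given key maps to the same partition,
        // no matter which map task emitted it.
        System.out.println(getPartition("apple", 4));
    }
}
```

Because the rule is deterministic per key, a map task can write all partitions into one file with an index of partition offsets, which is one reason for the single-output-file layout the question asks about.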
y setups and
> cleanups. What's Hadoop's philosophy on this?
>
> Thanks,
> Min
> --
> My research interests are distributed systems, parallel computing, and
> bytecode-based virtual machines.
>
> My profile:
> http://www.linkedin.com/in/coderplay
> My blog:
> http://coderplay.javaeye.com
>
--
Best Regards
Jeff Zhang
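On the setup/cleanup question quoted above: in the new (org.apache.hadoop.mapreduce) API, the task lifecycle is a template method, run(), which calls setup() once, map() once per input record, then cleanup() once. A self-contained sketch of that contract, using made-up class and method shapes rather than Hadoop's real signatures:

```java
import java.util.Arrays;
import java.util.List;

// Self-contained sketch of the new-API Mapper lifecycle contract:
// setup() once, map() per record, cleanup() once. Names and shapes
// here are illustrative, not Hadoop's actual types.
class LifecycleSketch {
    final StringBuilder trace = new StringBuilder();

    void setup()            { trace.append("setup;"); }
    void map(String record) { trace.append("map:").append(record).append(";"); }
    void cleanup()          { trace.append("cleanup;"); }

    // Mirrors the run() template method. In this sketch cleanup()
    // runs even if map() throws, so per-task teardown belongs there.
    void run(List<String> records) {
        setup();
        try {
            for (String r : records) map(r);
        } finally {
            cleanup();
        }
    }

    public static void main(String[] args) {
        LifecycleSketch m = new LifecycleSketch();
        m.run(Arrays.asList("a", "b"));
        System.out.println(m.trace);
    }
}
```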
-- Forwarded message --
From: Jeff Zhang
Date: Sun, Feb 21, 2010 at 7:49 AM
Subject: What is the biggest problem of an extremely large Hadoop cluster?
To: hdfs-...@hadoop.apache.org
Hi,
I am curious to know what the biggest problem of an extremely large Hadoop
cluster is. What I
t serialization. You can see
> where splits are serialized at JobSplitWriter.writeNewSplits() and
> deserialized on the task node at MapTask.getSplitDetails(). This is in
> contrast to the old API which mandated that InputSplits had to be
> Writable.
>
> Cheers,
> Tom
>
> O
Split.
--
Best Regards
Jeff Zhang
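To illustrate what the old API's Writable requirement quoted above meant in practice: each split class had to hand-serialize its own fields via write()/readFields(). A sketch with a made-up FileSpan class, loosely shaped like a FileSplit but not Hadoop's actual code:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Sketch of the old API's Writable-style contract for splits: the
// class itself serializes and rebuilds its fields. FileSpan is a
// made-up stand-in, not Hadoop's actual FileSplit.
class FileSpan {
    String path;
    long start, length;

    void write(DataOutput out) throws IOException {
        out.writeUTF(path);
        out.writeLong(start);
        out.writeLong(length);
    }

    void readFields(DataInput in) throws IOException {
        path = in.readUTF();
        start = in.readLong();
        length = in.readLong();
    }

    public static void main(String[] args) throws IOException {
        FileSpan a = new FileSpan();
        a.path = "/data/part-00000"; a.start = 0L; a.length = 6400L;

        // Round-trip: serialize on the "client side", rebuild on the
        // "task side" from the raw bytes.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        a.write(new DataOutputStream(buf));
        FileSpan b = new FileSpan();
        b.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
        System.out.println(b.path + " " + b.start + " " + b.length);
    }
}
```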
If there are too many values for a key, adding a new MapReduce procedure is
> necessary.
> If too many keys hash to a partition, resplitting is necessary.
>
> If every split is balanced, we can treat a task (map or reduce) as a
> scheduler timeslice, and the scheduler will be as smart as an OS scheduler.
>
--
Best Regards
Jeff Zhang
Owen,
It works, thank you for your help.
Jeff Zhang
On Tue, Nov 24, 2009 at 8:36 AM, Jeff Zhang wrote:
>
> You're right, I will try that.
>
> Thank you
>
>
> Jeff Zhang
>
>
>
> On Mon, Nov 23, 2009 at 9:19 AM, Owen O'Malley wrote:
>
>&
You're right, I will try that.
Thank you
Jeff Zhang
On Mon, Nov 23, 2009 at 9:19 AM, Owen O'Malley wrote:
>
> On Nov 22, 2009, at 4:48 PM, Jeff Zhang wrote:
>
> My concern is that it is just like hard code to use conf.setNumReduceTasks
>> on the configuratio
Owen,
My concern is that using conf.setNumReduceTasks on the configuration is
just like hard-coding. It is not flexible, so my idea is to add an
interface that changes the reducer number dynamically according to the
size of the input data set.
Jeff Zhang
On Sun, Nov 22, 2009 at 11:10
, they only need to provide their customized
implementation.
This is my initial idea; I am looking forward to the experts' feedback.
Thank you
Jeff Zhang
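The idea above can be sketched as a small sizing policy: derive the reduce-task count from the input size instead of hard-coding it, and feed the result to conf.setNumReduceTasks. The policy, its bytesPerReducer knob, and the class name are all hypothetical:

```java
// Hypothetical policy for choosing a reducer count from input size;
// a caller would pass the result to conf.setNumReduceTasks(...).
// bytesPerReducer and maxReducers are made-up tuning knobs.
public class ReducerCountPolicy {
    static int reducersFor(long inputBytes, long bytesPerReducer, int maxReducers) {
        // Ceiling division: one reducer per bytesPerReducer of input.
        long wanted = (inputBytes + bytesPerReducer - 1) / bytesPerReducer;
        // Clamp to [1, maxReducers] so tiny or huge inputs stay sane.
        return (int) Math.max(1, Math.min(wanted, maxReducers));
    }

    public static void main(String[] args) {
        long gb = 1L << 30;
        System.out.println(reducersFor(10 * gb, gb, 100)); // one reducer per GB of input
        System.out.println(reducersFor(0, gb, 100));       // never drops below one reducer
    }
}
```

An interface like this would let users override only the sizing rule while the framework applies it at job-submission time, once the input size is known.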
Hi all,
I'd like to contribute to Hadoop, and I'd like to get started by fixing
bugs. But I found in JIRA that it says I have no permission to work on
the JIRA item.
So how can I get permission to contribute to Hadoop?
Thank you.
Jeff Zhang