Re: Introduce Uniffle : A stability solution of Hive's shuffle

He Qi Thu, 03 Aug 2023 23:21:40 -0700

Thanks. We're testing the Tez support in a production environment.
The biggest open issue is adding the ability to recompute.
See https://github.com/apache/incubator-uniffle/issues/1011


In the future, we want Hive to stop relying on the disks. The Shuffle
Server could sort the data and flush it to HDFS, and the reducer would then
merge the files on HDFS.
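
To illustrate the merge step only, here is a minimal Python sketch. It is not
Uniffle code: the function name and record shape are hypothetical, and
in-memory lists stand in for the sorted files that a Shuffle Server would
have flushed to HDFS.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (key, value); hypothetical record shape


def merge_sorted_runs(runs: Iterable[Iterable[Record]]) -> Iterator[Record]:
    """K-way merge of runs that are each already sorted by key."""
    # heapq.merge consumes its inputs lazily, so only one pending record
    # per run is held in memory at a time.
    return heapq.merge(*runs, key=lambda record: record[0])


# Stand-ins for sorted files written by two shuffle servers (assumption).
run_a: List[Record] = [("apple", "1"), ("cherry", "3")]
run_b: List[Record] = [("banana", "2"), ("date", "4")]

merged = list(merge_sorted_runs([run_a, run_b]))
```

Because each input is already sorted, the reducer only needs a streaming
merge and never has to sort or spill the data itself.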

On 2023/07/20 10:08:05 Okumin wrote:
> Hi Rory,
> 
> Let me express my gratitude and positive impression of Uniffle.
> Actually, we also feel the necessity of a shuffle service for our Hive
> deployment, and I've been watching the project. I will check the
> implementation for Tez and send feedback or PRs if I find something.
> 
> Regards,
> Okumin
> 
> On Tue, Jul 11, 2023 at 9:48 PM roryqi <ror...@apache.org> wrote:
> 
> > Dear Apache Hive community,
> >
> >
> > We are delighted to announce the support of Tez on Uniffle. Uniffle now
> > supports Apache Spark, Apache Hadoop MapReduce, and Apache Tez.
> >
> > Uniffle is a remote shuffle service. In several situations, Uniffle will
> > provide great help.
> >
> >    1. If you use AWS spot instances or mixed resources, tasks may be
> >    preempted. Storing shuffle data in Uniffle, with Uniffle deployed on
> >    stable resources, improves the stability of tasks: if tasks are
> >    preempted, we won't recompute them because their shuffle data is
> >    kept in Uniffle.
> >    2. For large shuffle jobs, Uniffle can reduce random IO and improve
> >    job performance. For a 1TB MapReduce Terasort with 10,000 map tasks
> >    and 10,000 reduce tasks, job performance increases by 30%.
> >
> > We also welcome pull requests and are eager to see how you might use
> > Uniffle to make Hive more user-friendly. For more information, see
> > https://github.com/apache/incubator-uniffle
> >
> >
> > Best
> >
> > Rory
> >
>
