Reynold,

How complex was that job (I guess in terms of the number of transforms and actions), and how long did it take to process?
-Suren

On Thu, Mar 20, 2014 at 2:08 PM, Reynold Xin <r...@databricks.com> wrote:
> Actually we just ran a job with 70TB+ compressed data on 28 worker nodes -
> I didn't count the size of the uncompressed data, but I am guessing it is
> somewhere between 200TB and 700TB.
>
> On Thu, Mar 20, 2014 at 12:23 AM, Usman Ghani <us...@platfora.com> wrote:
>> All,
>> What is the largest input data set y'all have come across that has been
>> successfully processed in production using Spark? Ballpark?

--
SUREN HIRAMAN, VP TECHNOLOGY
Velos
Accelerating Machine Learning

440 NINTH AVENUE, 11TH FLOOR
NEW YORK, NY 10001
O: (917) 525-2466 ext. 105
F: 646.349.4063
E: suren.hiraman@velos.io <suren.hira...@sociocast.com>
W: www.velos.io