How can you verify that Spark loads only the time and network_id partitions
named in the filter?

2016-07-07 11:58 GMT+02:00 Chanh Le <giaosu...@gmail.com>:

> Hi Tan,
> It depends on how the data is organised and what your filter is.
> For example, in my case I store data partitioned by the fields time and
> network_id. If I filter by time, by network_id, or by both, together with
> other fields, Spark loads only the time and network_id partitions that match
> the filter, then applies the remaining conditions to that data.
>
>
>
> > On Jul 7, 2016, at 4:43 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> >
> > Does the filter under consideration operate on sorted column(s)?
> >
> > Cheers
> >
> >> On Jul 7, 2016, at 2:25 AM, tan shai <tan.shai...@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> I have a sorted dataframe and need to optimize filter operations on it.
> >> How does Spark perform filter operations on a sorted dataframe?
> >>
> >> Does it scan all the data?
> >>
> >> Many thanks.
> >
> > ---------------------------------------------------------------------
> > To unsubscribe e-mail: user-unsubscr...@spark.apache.org
> >
>
>
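
The partition pruning described in the thread (data stored by time and network_id, with only the matching partitions read and the remaining conditions applied afterwards) can be illustrated with a small plain-Python simulation. This is only a sketch of the idea, not Spark's implementation: the directory layout time=<value>/network_id=<value>/, the column name clicks, and the helper names write_partitioned, scan and demo are all assumptions made up for this example. Counting how many files get opened is the simulation's stand-in for checking what was actually loaded.

```python
import csv
import tempfile
from pathlib import Path


def write_partitioned(root, rows):
    """Write rows to root/time=<t>/network_id=<n>/part-0.csv, one file per partition."""
    groups = {}
    for row in rows:
        groups.setdefault((row["time"], row["network_id"]), []).append(row)
    for (time, network_id), part_rows in groups.items():
        part_dir = Path(root) / f"time={time}" / f"network_id={network_id}"
        part_dir.mkdir(parents=True, exist_ok=True)
        with open(part_dir / "part-0.csv", "w", newline="") as f:
            # Partition columns live in the directory names, so only the
            # hypothetical "clicks" column is stored inside the file.
            writer = csv.DictWriter(f, fieldnames=["clicks"])
            writer.writeheader()
            writer.writerows({"clicks": r["clicks"]} for r in part_rows)


def scan(root, partition_filter, row_filter):
    """Return (matching rows, number of files actually opened)."""
    files_opened = 0
    result = []
    for path in sorted(Path(root).glob("time=*/network_id=*/part-0.csv")):
        # Recover partition values from the path, e.g. {"time": ..., "network_id": ...}.
        keys = dict(part.split("=", 1) for part in path.relative_to(root).parts[:-1])
        if not partition_filter(keys):
            continue  # pruned: this file is never opened
        files_opened += 1
        with open(path, newline="") as f:
            for rec in csv.DictReader(f):
                row = {**keys, "clicks": int(rec["clicks"])}
                if row_filter(row):  # remaining, non-partition condition
                    result.append(row)
    return result, files_opened


def demo():
    rows = [
        {"time": "2016-07-06", "network_id": 1, "clicks": 5},
        {"time": "2016-07-06", "network_id": 2, "clicks": 50},
        {"time": "2016-07-07", "network_id": 1, "clicks": 7},
        {"time": "2016-07-07", "network_id": 1, "clicks": 70},
        {"time": "2016-07-07", "network_id": 2, "clicks": 9},
    ]
    with tempfile.TemporaryDirectory() as root:
        write_partitioned(root, rows)
        # Partition values read back from directory names are strings.
        return scan(
            root,
            partition_filter=lambda k: k["time"] == "2016-07-07" and k["network_id"] == "1",
            row_filter=lambda r: r["clicks"] > 10,
        )


if __name__ == "__main__":
    matched, files_opened = demo()
    print(matched, files_opened)
```

Of the four partition directories written here, only one passes the partition filter, so only one file is opened before the clicks condition is applied. In Spark itself, DataFrame.explain() prints the physical plan, which is one place to look for evidence that a filter on partition columns was used to restrict the scan; the exact plan output varies by Spark version.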
