How can you verify that Spark is loading only the partitions for the time and network in the filter?

2016-07-07 11:58 GMT+02:00 Chanh Le <giaosu...@gmail.com>:

> Hi Tan,
> It depends on how the data is organised and what your filter is.
> For example, in my case I store data partitioned by the fields time and
> network_id. If I filter by time or network_id, or both, together with
> other fields, Spark only loads the partitions for the time and network
> in the filter and then filters the rest.
>
> > On Jul 7, 2016, at 4:43 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> >
> > Does the filter under consideration operate on sorted column(s)?
> >
> > Cheers
> >
> >> On Jul 7, 2016, at 2:25 AM, tan shai <tan.shai...@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> I have a sorted dataframe and I need to optimize the filter operations.
> >> How does Spark perform filter operations on a sorted dataframe?
> >>
> >> Is it scanning all the data?
> >>
> >> Many thanks.
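
For what it's worth, here is a rough sketch of the idea described in the thread: data laid out in one directory per (time, network_id) pair, with a reader that opens only the directories matching the filter and then applies the remaining filter to the rows. This is plain Python, not Spark code. The helper names (write_partitioned, prune_partitions, scan) and the sample rows are made up for illustration, and the directory naming (time=.../network_id=...) is assumed to mirror the Hive-style layout Spark writes with partitionBy. In Spark itself, one way to check is probably to look at the physical plan printed by df.explain(True) and see which partition filters it lists; the exact wording of that output depends on the Spark version.

```python
import tempfile
from pathlib import Path


def write_partitioned(root, rows):
    """Write rows into a Hive-style layout: root/time=<t>/network_id=<n>/part-0.csv."""
    for row in rows:
        part_dir = root / f"time={row['time']}" / f"network_id={row['network_id']}"
        part_dir.mkdir(parents=True, exist_ok=True)
        with open(part_dir / "part-0.csv", "a") as f:
            f.write(f"{row['value']}\n")


def prune_partitions(root, partition_filter):
    """Return only the leaf directories whose partition values match the filter.

    Directories that do not match are never opened, which is the point of pruning.
    """
    selected = []
    for leaf in sorted(root.glob("time=*/network_id=*")):
        # Recover partition values from the directory names, e.g. "time=2016-07-07".
        values = dict(part.split("=", 1) for part in leaf.relative_to(root).parts)
        if all(values.get(k) == str(v) for k, v in partition_filter.items()):
            selected.append(leaf)
    return selected


def scan(root, partition_filter, row_predicate):
    """Read only the pruned partitions, then apply the remaining (non-partition) filter."""
    dirs = prune_partitions(root, partition_filter)
    kept = []
    for d in dirs:
        for line in (d / "part-0.csv").read_text().splitlines():
            value = int(line)
            if row_predicate(value):
                kept.append(value)
    return dirs, kept


# Hypothetical sample data: three partitions, four rows.
rows = [
    {"time": "2016-07-06", "network_id": 1, "value": 10},
    {"time": "2016-07-07", "network_id": 1, "value": 20},
    {"time": "2016-07-07", "network_id": 1, "value": 5},
    {"time": "2016-07-07", "network_id": 2, "value": 30},
]

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    write_partitioned(root, rows)
    dirs, values = scan(root, {"time": "2016-07-07", "network_id": 1}, lambda v: v > 8)
    # Record which partition directories were actually read.
    touched = [d.relative_to(root).as_posix() for d in dirs]
    print(touched)
    print(values)
```

The list of touched directories is the thing to look at: if pruning works, it contains only the partitions named in the filter, and the non-partition condition (here v > 8) is applied afterwards to the rows that were read.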