hat do aggregations on samples of
> the data (cf.
> https://jornfranke.wordpress.com/2015/06/28/big-data-what-is-next-oltp-olap-predictive-analytics-sampling-and-probabilistic-databases).
> E.g. Hive has had TABLESAMPLE functionality for a long time.
>
> On 5 Mar 2017, at 21:49, Allan R
Hi,
I am looking to use Spark to help execute queries against a reasonably
large dataset (1 billion rows). I'm a bit lost with all the different
libraries / add-ons for Spark, and am looking for some direction as to what
I should look at / what may be helpful.
A couple of relevant points:
- The