Re: Apache design patterns

2016-06-09 Thread Daniel Siegmann
On Tue, Jun 7, 2016 at 11:43 PM, Francois Le Roux wrote:
> 1. Should I use dataframes to 'pull' the source data? If so, do I do
> a groupby and order by as part of the SQL query?
Seems reasonable. If you use Scala you might want to define a case class and convert

Re: Apache design patterns

2016-06-07 Thread Francois Le Roux
Thanks, Ted. Hi, I have been working through some example tutorials for Apache Spark in an attempt to establish how I would solve the following scenario (see data examples in Appendix): I have 1 billion+ rows that have a key value (i.e. driver ID) and a number of relevant attributes (product

Re: Apache design patterns

2016-06-07 Thread Ted Yu
I think this is the correct forum. Please describe your case.
> On Jun 7, 2016, at 8:33 PM, Francois Le Roux wrote:
>
> Hi folks, I have been working through the available online Apache Spark
> tutorials and I am stuck with a scenario that I would like to solve in

Apache design patterns

2016-06-07 Thread Francois Le Roux
Hi folks, I have been working through the available online Apache Spark tutorials and I am stuck with a scenario that I would like to solve in Spark. Is this a forum where I can publish a narrative for the problem / scenario that I am trying to solve? Any assistance appreciated. Thanks, Frank