I would make a DataFrame (or Dataset) out of each RDD and use a SQL join.

On Wed, Apr 27, 2016 at 2:50 PM Eduardo <erocha....@gmail.com> wrote:
> Is there a way to write a transformation that, for each entry of an RDD,
> uses certain other values of another RDD? As an example, imagine you have an
> RDD of entries for which to predict a certain label. In a second RDD, you have
> historical data. So for each entry in the first RDD, you want to find
> similar entries in the second RDD and take, let's say, the average. Does
> that fit the Spark model? Is there any alternative?
>
> Thanks in advance
>
--
Mathieu Longtin
1-514-803-8977