I collected the small DF to an array of tuple3, then registered a UDF with a function that does a lookup in that array. Then I just run a select which uses the UDF.

On Dec 18, 2015 1:06 AM, "Akhil Das" <ak...@sigmoidanalytics.com> wrote:
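A minimal sketch of that approach, with the Spark-specific steps shown in comments (the function name `rangeName`, the hard-coded ranges, and the table name `events` are illustrative assumptions, not from the thread). In a real job the array would come from collecting the small DataFrame on the driver:

```scala
object RangeLookup {
  // The small ranges table, collected to the driver -- in Spark this
  // would be something like: rangesDF.map(...).collect()
  // Here it is hard-coded for illustration.
  val ranges: Array[(Int, Int, String)] = Array(
    (0, 10, "A"),
    (10, 20, "B"),
    (20, 30, "C")
  )

  // The UDF body: find the range whose [min, max) interval contains d.
  def lookup(d: Int): String =
    ranges
      .find { case (min, max, _) => d >= min && d < max }
      .map(_._3)
      .getOrElse("UNKNOWN")

  // In Spark the registration and query would look roughly like:
  //   sqlContext.udf.register("rangeName", lookup _)
  //   sqlContext.sql("SELECT duration, rangeName(duration) FROM events")

  def main(args: Array[String]): Unit = {
    println(lookup(3))   // A
    println(lookup(12))  // B
    println(lookup(26))  // C
  }
}
```

Because the ranges array lives in the UDF's closure, Spark ships it to every executor automatically, so the lookup happens map-side with no join at all.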
> You can broadcast your json data and then do a map-side join. This article
> is a good start: http://dmtolpeko.com/2015/02/20/map-side-join-in-spark/
>
> Thanks
> Best Regards
>
> On Wed, Dec 16, 2015 at 2:51 AM, Alexander Pivovarov <apivova...@gmail.com>
> wrote:
>
>> I have a big folder containing ORC files. The files have a duration field
>> (e.g. 3, 12, 26, etc.)
>> I also have a small json file (just 8 rows) with range definitions (min,
>> max, name):
>> 0, 10, A
>> 10, 20, B
>> 20, 30, C
>> etc.
>>
>> Because I can not do an equi-join between duration and the range min/max,
>> I need to do a cross join and apply a WHERE condition to keep the records
>> which belong to the range.
>> A cross join is an expensive operation; I think this particular join is
>> better done as a map join.
>>
>> How do I do a map join in Spark SQL?
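The broadcast-based map-side join suggested above can be sketched as follows. This is a plain-Scala simulation of the per-partition logic, with the Spark broadcast step shown in comments; the data and method names are illustrative assumptions:

```scala
object MapSideJoin {
  // The small json table (min, max, name) -- small enough to ship to
  // every executor. In Spark: val bc = sc.broadcast(smallTable)
  val smallTable: Array[(Int, Int, String)] = Array(
    (0, 10, "A"),
    (10, 20, "B"),
    (20, 30, "C")
  )

  // Map-side join: each record of the big dataset is matched against the
  // small table locally, so there is no shuffle and no cross join.
  def join(durations: Seq[Int]): Seq[(Int, String)] =
    durations.map { d =>
      val name = smallTable
        .find { case (min, max, _) => d >= min && d < max }
        .map(_._3)
        .getOrElse("UNKNOWN")
      (d, name)
    }

  // The Spark equivalent would look roughly like:
  //   val bc = sc.broadcast(smallTable)
  //   bigRdd.map { d =>
  //     (d, bc.value.find { case (min, max, _) => d >= min && d < max }
  //               .map(_._3).getOrElse("UNKNOWN"))
  //   }

  def main(args: Array[String]): Unit = {
    println(join(Seq(3, 12, 26)))
  }
}
```

The cross join plus WHERE filter from the original question and this map-side lookup produce the same rows; the difference is that here the 8-row table is replicated to each task instead of being joined against every record of the big dataset.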