Hi Mich, The input and output are just examples; those are not the exact column names. ColC is not needed.
The code which I shared is working fine, but I need to confirm: is it the right approach, and does it affect performance?

Thanks,
Selvam R
+91-97877-87724

On Aug 16, 2016 5:18 PM, "Mich Talebzadeh" <mich.talebza...@gmail.com> wrote:
> Hi Selvam,
>
> Is the table called sel?
>
> And are these assumptions correct?
>
> site -> ColA
> requests -> ColB
>
> I don't think you are using ColC here?
>
> HTH
>
> Dr Mich Talebzadeh
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
> On 16 August 2016 at 12:06, Selvam Raman <sel...@gmail.com> wrote:
>
>> Hi All,
>>
>> Please suggest the best approach to achieve this result. [Please comment
>> on whether the existing logic is fine or not.]
>>
>> Input records:
>>
>> ColA ColB ColC
>> 1    2    56
>> 1    2    46
>> 1    3    45
>> 1    5    34
>> 1    5    90
>> 2    1    89
>> 2    5    45
>>
>> Expected result:
>>
>> ResA ResB
>> 1    2:2|3:3|5:5
>> 2    1:1|5:5
>>
>> I followed the steps below (Spark version 1.5.0):
>>
>> def valsplit(elem: scala.collection.mutable.WrappedArray[String]): String = {
>>   elem.map(e => e + ":" + e).mkString("|")
>> }
>>
>> sqlContext.udf.register("valudf",
>>   valsplit(_: scala.collection.mutable.WrappedArray[String]))
>>
>> val x = sqlContext.sql("select site, valudf(collect_set(requests)) as test
>>   from sel group by site").first
>>
>> --
>> Selvam Raman
>> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
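[Editor's note] The pair-building logic in the thread can be exercised on its own, independent of Spark. This is a minimal sketch assuming the collected requests arrive as strings; since WrappedArray is a Seq, the same body works unchanged inside the registered UDF:

```scala
// Standalone sketch of the UDF logic (no Spark required).
// Assumption: the values collected by collect_set are strings.
def valsplit(elem: Seq[String]): String =
  elem.map(e => e + ":" + e).mkString("|")

// Example: the collected set for ColA = 1 from the thread above.
println(valsplit(Seq("2", "3", "5")))  // 2:2|3:3|5:5
```

On performance: the UDF itself is cheap (it runs once per group over a small array); the dominant cost is the shuffle that group-by-with-collect_set requires, and that shuffle is inherent to the result you want. One possible UDF-free variant, worth verifying on your Spark version, would be to build the pairs with built-in functions before collecting, e.g. `select site, concat_ws('|', collect_set(concat(requests, ':', requests))) from sel group by site` — but note that collect_set does not guarantee element order in either approach.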