Re: Data frame Performance

2016-08-16 Thread Selvam Raman
Hi Mich,

The input and output are just an example; they are not the exact column
names. ColC is not needed.

The code I shared works fine, but I need to confirm whether it is the right
approach and whether it affects performance.

Thanks,
Selvam R
+91-97877-87724
On Aug 16, 2016 5:18 PM, "Mich Talebzadeh" wrote:

> Hi Selvam,
>
> Is the table called sel?
>
> And are these assumptions correct?
>
> site -> ColA
> requests -> ColB
>
> I don't think you are using ColC here?
>
> HTH


Re: Data frame Performance

2016-08-16 Thread Mich Talebzadeh
Hi Selvam,

Is the table called sel?

And are these assumptions correct?

site -> ColA
requests -> ColB

I don't think you are using ColC here?

HTH



Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.





Data frame Performance

2016-08-16 Thread Selvam Raman
Hi All,

Please suggest the best approach to achieve this result. [Please comment on
whether the existing logic is fine or not.]

Input Record:

ColA  ColB  ColC
1     2     56
1     2     46
1     3     45
1     5     34
1     5     90
2     1     89
2     5     45

Expected Result:

ResA  ResB
1     2:2|3:3|5:5
2     1:1|5:5

I followed the Spark steps below (Spark version 1.5.0):

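import scala.collection.mutable.WrappedArray

// Format each collected value as "value:value" and join the pairs with "|".
def valsplit(elem: WrappedArray[String]): String =
  elem.map(e => e + ":" + e).mkString("|")

// Register the function so it can be called from SQL as valudf.
sqlContext.udf.register("valudf", valsplit(_: WrappedArray[String]))

// Collect the distinct requests per site, format them, and fetch the first row.
val x = sqlContext.sql(
  "select site, valudf(collect_set(requests)) as test from sel group by site"
).first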
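
For checking the logic without a cluster, the grouping and formatting steps above can be sketched with plain Scala collections. This is only a local stand-in, not Spark code: the object name ValSplitCheck and the helper aggregate are hypothetical, the rows are the sample input with ColC dropped, and the values are assumed to be strings. The sketch sorts the distinct values so the output is repeatable; collect_set itself does not guarantee any element order, so the Spark result may list the pairs differently.

```scala
object ValSplitCheck {
  // Same formatting as the UDF body: each element becomes "e:e", joined with "|".
  def valsplit(elem: Seq[String]): String =
    elem.map(e => e + ":" + e).mkString("|")

  // Hypothetical local equivalent of
  //   select site, valudf(collect_set(requests)) ... group by site
  // distinct plays the role of collect_set; sorted is added only for stable output.
  def aggregate(rows: Seq[(String, String)]): Map[String, String] =
    rows.groupBy(_._1).map { case (site, rs) =>
      site -> valsplit(rs.map(_._2).distinct.sorted)
    }

  def main(args: Array[String]): Unit = {
    // Sample input as (site, requests) pairs, i.e. (ColA, ColB).
    val rows = Seq(
      ("1", "2"), ("1", "2"), ("1", "3"), ("1", "5"), ("1", "5"),
      ("2", "1"), ("2", "5")
    )
    aggregate(rows).toSeq.sortBy(_._1).foreach { case (site, res) =>
      println(site + "\t" + res)
    }
  }
}
```

Running main on the sample rows should give one line per site in the same shape as the expected result table.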



-- 
Selvam Raman
"லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"