Any update?

On Fri, Aug 7, 2015 at 10:13 AM, Supun Sethunga <[email protected]> wrote:
> Hi Danula,
>
> Sorry I couldn't join the meeting. Can you please share the meeting/review
> notes? Also, what is the progress on the suggestions, and what is left to
> be done overall?
>
> Thanks,
> Supun
>
> On Wed, Aug 5, 2015 at 3:47 AM, Nirmal Fernando <[email protected]> wrote:
>
>> Hi Danula,
>>
>> It should be a JavaRDD<String[]>, where each row represents the feature
>> vector as a String[].
>>
>> On Tue, Aug 4, 2015 at 11:51 AM, Danula Eranjith <[email protected]> wrote:
>>
>>> In other words,
>>> What would be the preferred output type for a dataset which is
>>> pre-processed by wrangler?
>>> As I have observed, different algorithms use different JavaRDD types as
>>> input (JavaRDD<String>, JavaRDD<Vector>, etc.)
>>>
>>> On Tue, Aug 4, 2015 at 11:48 AM, Nirmal Fernando <[email protected]> wrote:
>>>
>>>> Hi Danula,
>>>>
>>>> On Tue, Aug 4, 2015 at 11:47 AM, Danula Eranjith <[email protected]> wrote:
>>>>
>>>>> Hi Nirmal,
>>>>>
>>>>> In ML, what is the preferred way of keeping data in a single row of
>>>>> JavaRDD?
>>>>>
>>>>
>>>> I didn't quite get your question. Can you elaborate please?
>>>>
>>>>
>>>>>
>>>>> As I have figured, it depends on the algorithm being used.
>>>>>
>>>>> Danula
>>>>>
>>>>> On Thu, Jul 30, 2015 at 9:14 AM, Nirmal Fernando <[email protected]> wrote:
>>>>>
>>>>>> Thanks Danula, I'll send an invite.
>>>>>>
>>>>>> On Wed, Jul 29, 2015 at 10:24 PM, Danula Eranjith <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Nirmal,
>>>>>>>
>>>>>>> I am available after 1.30pm on Tuesday, Wednesday and Thursday.
>>>>>>>
>>>>>>> Danula
>>>>>>>
>>>>>>> On Wed, Jul 29, 2015 at 12:10 PM, Nirmal Fernando <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Danula,
>>>>>>>>
>>>>>>>> Can we arrange a demo/review sometime next week? Please let me
>>>>>>>> know a few time slots.
>>>>>>>>
>>>>>>>> On Thu, Jul 23, 2015 at 11:47 AM, Nirmal Fernando <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Thanks Danula.
>>>>>>>>>
>>>>>>>>> On Thu, Jul 23, 2015 at 11:41 AM, Danula Eranjith <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> You can find the source at [1]
>>>>>>>>>> <https://github.com/danula/wso2-ml-wrangler-integration>. I have
>>>>>>>>>> to do some refactoring when integrating to ML.
>>>>>>>>>>
>>>>>>>>>> [1] - https://github.com/danula/wso2-ml-wrangler-integration
>>>>>>>>>>
>>>>>>>>>> On Thu, Jul 23, 2015 at 11:31 AM, Nirmal Fernando <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks Danula. Please share the current code, if possible.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jul 23, 2015 at 8:41 AM, Danula Eranjith <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> I have succeeded in parsing the operations from the wrangler
>>>>>>>>>>>> javascript code into the spark transformations I have written.
>>>>>>>>>>>> Working on automating the process.
>>>>>>>>>>>>
>>>>>>>>>>>> The last couple of steps would be changing the wrangler interface
>>>>>>>>>>>> and integrating it into the ML Wizard.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> Danula
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Jul 22, 2015 at 9:31 AM, Nirmal Fernando <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Could you please summarize the current status of the project
>>>>>>>>>>>>> and also the things left to do?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Jul 19, 2015 at 11:39 PM, Danula Eranjith <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you.
>>>>>>>>>>>>>> Will use them. I already have some other kaggle datasets as
>>>>>>>>>>>>>> well.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Jul 19, 2015 at 11:30 PM, Danula Eranjith <[email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Nirmal,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Would it be possible to get some sample data sets which are
>>>>>>>>>>>>>>>> more likely to be pre-processed using wrangler? I am currently
>>>>>>>>>>>>>>>> testing my implementations against small and more general
>>>>>>>>>>>>>>>> data sets.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I have checked the datasets available at [1]
>>>>>>>>>>>>>>>> <https://github.com/wso2/product-ml/tree/master/modules/samples>
>>>>>>>>>>>>>>>> as well. But there is not much to be processed, as they are
>>>>>>>>>>>>>>>> ready to be fed to ML.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [1] - https://github.com/wso2/product-ml/tree/master/modules/samples
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Jul 16, 2015 at 10:15 PM, Nirmal Fernando <[email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks Danula.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Jul 16, 2015 at 10:07 PM, Danula Eranjith <[email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Sorry for not keeping you in the loop.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> After considering and experimenting with several options,
>>>>>>>>>>>>>>>>>> I am using the javascript code generated by wrangler to
>>>>>>>>>>>>>>>>>> implement the operations using spark. I have used regular
>>>>>>>>>>>>>>>>>> expressions to extract the operations, parameters and values
>>>>>>>>>>>>>>>>>> and mapped them to the spark transformations I previously
>>>>>>>>>>>>>>>>>> developed.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The code generated by wrangler for certain functions has
>>>>>>>>>>>>>>>>>> nested operations.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> (1)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> /* Fill split3 with values from above */
>>>>>>>>>>>>>>>>>> w.add(dw.fill().column(["split3"])
>>>>>>>>>>>>>>>>>>     .table(0)
>>>>>>>>>>>>>>>>>>     .status("active")
>>>>>>>>>>>>>>>>>>     .drop(false)
>>>>>>>>>>>>>>>>>>     .direction("down")
>>>>>>>>>>>>>>>>>>     .method("copy")
>>>>>>>>>>>>>>>>>>     .row(undefined)
>>>>>>>>>>>>>>>>>> )
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> (2)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> /* Delete rows where split1 is null */
>>>>>>>>>>>>>>>>>> w.add(dw.filter().column([])
>>>>>>>>>>>>>>>>>>     .table(0)
>>>>>>>>>>>>>>>>>>     .status("active")
>>>>>>>>>>>>>>>>>>     .drop(false)
>>>>>>>>>>>>>>>>>>     .row(dw.row().column([])
>>>>>>>>>>>>>>>>>>         .table(0)
>>>>>>>>>>>>>>>>>>         .status("active")
>>>>>>>>>>>>>>>>>>         .drop(false)
>>>>>>>>>>>>>>>>>>         .conditions([dw.is_null().column([])
>>>>>>>>>>>>>>>>>>             .table(0)
>>>>>>>>>>>>>>>>>>             .status("active")
>>>>>>>>>>>>>>>>>>             .drop(false)
>>>>>>>>>>>>>>>>>>             .lcol("split1")
>>>>>>>>>>>>>>>>>>             .value(undefined)
>>>>>>>>>>>>>>>>>>             .op_str("is null")
>>>>>>>>>>>>>>>>>>         ])
>>>>>>>>>>>>>>>>>>     )
>>>>>>>>>>>>>>>>>> )
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I have succeeded in parsing the operations similar to (1)
>>>>>>>>>>>>>>>>>> above and am currently working on extending it to work on
>>>>>>>>>>>>>>>>>> operations similar to (2).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The next step would be automating the process of spark
>>>>>>>>>>>>>>>>>> transformation generation.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Jul 15, 2015 at 7:32 PM, Nirmal Fernando <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Please send an update at least every week.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, Jul 15, 2015 at 5:51 PM, Supun Sethunga <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Any update on the progress? Did you manage to
>>>>>>>>>>>>>>>>>>>> integrate the transformations with the wrangler?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Thu, Jul 2, 2015 at 11:38 AM, Danula Eranjith <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Update on the current progress of the project and
>>>>>>>>>>>>>>>>>>>>> future activities as we discussed at the recent meeting.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> *Current Progress*
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I have completed the phase of creating spark
>>>>>>>>>>>>>>>>>>>>> transformations relevant to operations available in
>>>>>>>>>>>>>>>>>>>>> wrangler.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Operations implemented
>>>>>>>>>>>>>>>>>>>>> - Fill
>>>>>>>>>>>>>>>>>>>>> - Split
>>>>>>>>>>>>>>>>>>>>> - Drop
>>>>>>>>>>>>>>>>>>>>> - Delete
>>>>>>>>>>>>>>>>>>>>> - Extract
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> *Future activities*
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> - Modify the wrangler interface to suit the current
>>>>>>>>>>>>>>>>>>>>> implementation
>>>>>>>>>>>>>>>>>>>>> - Automate the process of generating Spark
>>>>>>>>>>>>>>>>>>>>> transformations
>>>>>>>>>>>>>>>>>>>>> - Integrate wrangler into the ML workflow
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Sun, Jun 28, 2015 at 9:31 AM, Danula Eranjith <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> No, we haven't done a review yet.
>>>>>>>>>>>>>>>>>>>>>> It would be great if we could have one so that I can
>>>>>>>>>>>>>>>>>>>>>> discuss with you all and clarify the next steps of the
>>>>>>>>>>>>>>>>>>>>>> implementation as you mentioned.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Sun, Jun 28, 2015 at 9:25 AM, Supun Sethunga <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Did we have a review for the work done so far? If
>>>>>>>>>>>>>>>>>>>>>>> not, shall we have one? We can clear out any doubts
>>>>>>>>>>>>>>>>>>>>>>> and issues as well.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>> Supun
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Wed, Jun 24, 2015 at 6:42 AM, Nirmal Fernando <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the update, keep them coming.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On a JavaRDD you can perform a collect() to get a
>>>>>>>>>>>>>>>>>>>>>>>> list, AFAIR. Yes, this is costly, since it would load
>>>>>>>>>>>>>>>>>>>>>>>> the whole dataset into memory. So, is this an
>>>>>>>>>>>>>>>>>>>>>>>> operation which involves multiple rows?
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Jun 23, 2015 at 2:15 PM, Danula Eranjith <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Hi Supun,
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I modified the "Fill" operation to add what you
>>>>>>>>>>>>>>>>>>>>>>>>> mentioned.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I used a workaround to implement certain parts
>>>>>>>>>>>>>>>>>>>>>>>>> of the operations, such as filling with values from
>>>>>>>>>>>>>>>>>>>>>>>>> rows above and below.
>>>>>>>>>>>>>>>>>>>>>>>>> I created a List implementation using the toArray()
>>>>>>>>>>>>>>>>>>>>>>>>> method in JavaRDD and then converted it back to a
>>>>>>>>>>>>>>>>>>>>>>>>> JavaRDD after the operation.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> This will be inefficient (in terms of both memory
>>>>>>>>>>>>>>>>>>>>>>>>> and time) when working with very large data sets. But
>>>>>>>>>>>>>>>>>>>>>>>>> I think it's important to have these features
>>>>>>>>>>>>>>>>>>>>>>>>> included. Otherwise a user would be left with a very
>>>>>>>>>>>>>>>>>>>>>>>>> limited set of operations.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Please let me know if you have a different opinion
>>>>>>>>>>>>>>>>>>>>>>>>> on this.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Jun 16, 2015 at 9:44 AM, Supun Sethunga <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Somehow there are issues in implementing certain
>>>>>>>>>>>>>>>>>>>>>>>>>>> wrangler functions due to limitations in JavaRDD
>>>>>>>>>>>>>>>>>>>>>>>>>>> used in spark
>>>>>>>>>>>>>>>>>>>>>>>>>>> e.g. -
>>>>>>>>>>>>>>>>>>>>>>>>>>> Fill operation - when filling with values from
>>>>>>>>>>>>>>>>>>>>>>>>>>> rows above and below
>>>>>>>>>>>>>>>>>>>>>>>>>>> Fold operation
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Agreed. Since rows will get executed randomly with
>>>>>>>>>>>>>>>>>>>>>>>>>> spark, inter-row operations are not very meaningful.
>>>>>>>>>>>>>>>>>>>>>>>>>> But you can slightly modify the implementation of
>>>>>>>>>>>>>>>>>>>>>>>>>> the "Fill" operation, for example to fill values
>>>>>>>>>>>>>>>>>>>>>>>>>> based on an expression, a static value, the mean,
>>>>>>>>>>>>>>>>>>>>>>>>>> etc. (not depending on other rows).
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>> Supun
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Jun 16, 2015 at 9:27 AM, Supun Sethunga <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Sorry for the late reply. Have you got the
>>>>>>>>>>>>>>>>>>>>>>>>>>> details you were looking for?
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> It would be great if I could get to know which
>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrangler operations are important for a user of
>>>>>>>>>>>>>>>>>>>>>>>>>>>> the ML
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Other than the ones you have mentioned in the
>>>>>>>>>>>>>>>>>>>>>>>>>>> proposal, I think it's better to have a "Translate"
>>>>>>>>>>>>>>>>>>>>>>>>>>> operation as well (to create a new column based on
>>>>>>>>>>>>>>>>>>>>>>>>>>> an existing column).
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>> Supun
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Jun 4, 2015 at 10:11 PM, Danula Eranjith <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> I am currently working on generating spark
>>>>>>>>>>>>>>>>>>>>>>>>>>>> transformations related to the operations
>>>>>>>>>>>>>>>>>>>>>>>>>>>> available in the data wrangler.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Data wrangler provides sufficient parameters to
>>>>>>>>>>>>>>>>>>>>>>>>>>>> re-create these in spark. I have successfully
>>>>>>>>>>>>>>>>>>>>>>>>>>>> implemented the delete and split operations of
>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrangler in spark.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Once this phase is completed, I can either
>>>>>>>>>>>>>>>>>>>>>>>>>>>> directly generate these scripts at wrangler or use
>>>>>>>>>>>>>>>>>>>>>>>>>>>> the javascript output and convert it to spark,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> depending on the implementation.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Somehow there are issues in implementing
>>>>>>>>>>>>>>>>>>>>>>>>>>>> certain wrangler functions due to limitations in
>>>>>>>>>>>>>>>>>>>>>>>>>>>> JavaRDD used in spark
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> e.g. -
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Fill operation - when filling with values from
>>>>>>>>>>>>>>>>>>>>>>>>>>>> rows above and below
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Fold operation
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> It would be great if I could get to know which
>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrangler operations are important for a user of
>>>>>>>>>>>>>>>>>>>>>>>>>>>> the ML
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Jun 3, 2015 at 8:30 AM, Nirmal Fernando <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Please send an update of your work thus far.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Sun, May 10, 2015 at 2:30 PM, Nirmal Fernando <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Welcome to GSoC '15! Can you do some
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> research on directly generating spark
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> transformations using Wrangler and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> come up with a summary?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, May 8, 2015 at 11:03 AM, Danula Eranjith <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you for selecting my proposal [1]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> <https://docs.google.com/document/d/18NFa23CrhXqnHrkl_AuRz3sQ3Axg7SEmiA7l66Hl9_0/edit?usp=sharing>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for GSoC 2015. I am really looking forward to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> working with you all and contributing to WSO2.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I have already completed my primary research
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> on wrangler and would like to meet you to get
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> feedback on the proposed architecture. I am
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> planning to start working on the project before
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 25th of May.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1] - https://docs.google.com/document/d/18NFa23CrhXqnHrkl_AuRz3sQ3Axg7SEmiA7l66Hl9_0/edit?usp=sharing
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks & regards,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Nirmal
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Mobile: +94715779733
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>> --
>>
>> Thanks & regards,
>> Nirmal
>>
>> Team Lead - WSO2 Machine Learner
>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>> Mobile: +94715779733
>> Blog: http://nirmalfdo.blogspot.com/
>>
>
--
*Supun Sethunga*
Software Engineer
WSO2, Inc.
http://wso2.com/
lean | enterprise | middleware
Mobile : +94 716546324
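
The "fill with values from the row above" workaround described in the thread (bring the rows to the driver as a list, fill, then convert back to a JavaRDD) can be sketched in plain Java. This is only an illustration under stated assumptions, not the code in the wso2-ml-wrangler-integration repository: the class and method names are hypothetical, rows are String[] as suggested for JavaRDD<String[]>, null and empty strings are assumed to mean "missing", and the Spark calls that would surround it (collecting the rows and re-parallelizing the result) are left out so the example runs without Spark.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FillDownSketch {

    /** Assumption: a cell counts as missing when it is null or empty. */
    public static boolean isMissing(String value) {
        return value == null || value.isEmpty();
    }

    /**
     * Fills missing cells of one column with the last non-missing value
     * seen in an earlier row (direction "down", method "copy").
     * Each row is copied, so the input list is not modified.
     */
    public static List<String[]> fillDown(List<String[]> rows, int column) {
        List<String[]> filled = new ArrayList<>(rows.size());
        String lastSeen = null;
        for (String[] row : rows) {
            String[] copy = Arrays.copyOf(row, row.length);
            if (isMissing(copy[column])) {
                // Leading missing cells stay as they are: nothing to copy yet.
                if (lastSeen != null) {
                    copy[column] = lastSeen;
                }
            } else {
                lastSeen = copy[column];
            }
            filled.add(copy);
        }
        return filled;
    }

    public static void main(String[] args) {
        // Stand-in for the rows of a small dataset already held in memory.
        List<String[]> rows = Arrays.asList(
                new String[] {"1", "a"},
                new String[] {"2", ""},
                new String[] {"3", null},
                new String[] {"4", "b"});
        for (String[] row : fillDown(rows, 1)) {
            System.out.println(String.join(",", row));
        }
    }
}
```

Because the fill depends on row order, the whole dataset has to be visible in one place, which is exactly the memory cost raised in the thread.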
_______________________________________________ Dev mailing list [email protected] http://wso2.org/cgi-bin/mailman/listinfo/dev
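
The regular-expression approach mentioned in the thread, extracting the operation and its parameters from the javascript generated by wrangler, can be sketched for flat operations like (1). The two patterns and the class below are illustrative assumptions, not the expressions used in the actual integration, and they deliberately do not handle nested operations like (2), where a parameter value itself contains a dw.row() or dw.is_null() call.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WranglerParseSketch {

    // Matches the operation in "dw.<name>()", e.g. "dw.fill()".
    private static final Pattern OPERATION = Pattern.compile("dw\\.(\\w+)\\(\\)");

    // Matches one chained call ".<param>(<value>)" whose value has no parentheses.
    private static final Pattern PARAMETER = Pattern.compile("\\.(\\w+)\\(([^()]*)\\)");

    /** Returns the operation name, or null if the script has no "dw.<name>()". */
    public static String operationName(String script) {
        Matcher m = OPERATION.matcher(script);
        return m.find() ? m.group(1) : null;
    }

    /** Returns the chained parameters after the operation, in script order. */
    public static Map<String, String> parameters(String script) {
        Map<String, String> params = new LinkedHashMap<>();
        Matcher op = OPERATION.matcher(script);
        // Start after "dw.<name>()" so the operation is not read as a parameter.
        int start = op.find() ? op.end() : 0;
        Matcher m = PARAMETER.matcher(script);
        m.region(start, script.length());
        while (m.find()) {
            // Values are kept as raw javascript text, quotes and brackets included.
            params.put(m.group(1), m.group(2).trim());
        }
        return params;
    }

    public static void main(String[] args) {
        String script = "w.add(dw.fill().column([\"split3\"]) .table(0) "
                + ".status(\"active\") .drop(false) .direction(\"down\") "
                + ".method(\"copy\") .row(undefined) )";
        System.out.println(operationName(script));
        System.out.println(parameters(script));
    }
}
```

A map of raw parameter text is enough to pick a transformation (for example fill, direction "down", method "copy"); nested operations would more likely need a small recursive parser than a longer regular expression.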
