Hi Supun,

The following points were discussed in the meeting.
*Integration to ML*

We decided to add the wrangler interface as the first step, considering the current ML implementation. The steps from a user's perspective would be as follows:

- A sample from the dataset will be sent to the wrangler interface.
- The user can apply the desired operations in the wrangler interface.
- The user can return to ML by clicking a button in the interface.
- Viewing the script will be optional for the user.
- On returning to ML, Spark transformations are automatically generated and applied to the dataset.

*Spark Transformations*

I have implemented all the wrangler transformations by extending a single abstract class. These operations are invoked by parsing the JavaScript code generated by wrangler. However, since the ML Spark transformations are applied all together at the end of the process, I have to persist all the parameters and keep the operations as a list which can be invoked later. Nirmal pointed out that this could be achieved by using the chain of responsibility design pattern. I am currently changing the implementation accordingly.

I will get back to you and Nirmal when the automation process is completed, so that we can start the integration.

Regards,
Danula

On Mon, Aug 10, 2015 at 9:29 PM, Supun Sethunga <sup...@wso2.com> wrote:

> Any update?
>
> On Fri, Aug 7, 2015 at 10:13 AM, Supun Sethunga <sup...@wso2.com> wrote:
>
>> Hi Danula,
>>
>> Sorry I couldn't join the meeting. Can you please share the meeting/review notes? Also the progress on the suggestions and what is left to be done overall?
>>
>> Thanks,
>> Supun
>>
>> On Wed, Aug 5, 2015 at 3:47 AM, Nirmal Fernando <nir...@wso2.com> wrote:
>>
>>> Hi Danula,
>>>
>>> It should be a JavaRDD<String[]>, where each row represents the feature vector as a String[].
>>>
>>> On Tue, Aug 4, 2015 at 11:51 AM, Danula Eranjith <hmdanu...@gmail.com> wrote:
>>>
>>>> In other words,
>>>> What would be the preferred output type for a dataset which is pre-processed by wrangler?
>>>> As I have observed, different algorithms use different JavaRDD types as input (JavaRDD<String>, JavaRDD<Vector> etc.)
>>>>
>>>> On Tue, Aug 4, 2015 at 11:48 AM, Nirmal Fernando <nir...@wso2.com> wrote:
>>>>
>>>>> Hi Danula,
>>>>>
>>>>> On Tue, Aug 4, 2015 at 11:47 AM, Danula Eranjith <hmdanu...@gmail.com> wrote:
>>>>>
>>>>>> Hi Nirmal,
>>>>>>
>>>>>> In ML, what is the preferred way of keeping data in a single row of JavaRDD?
>>>>>
>>>>> I didn't quite get your question. Can you elaborate please?
>>>>>
>>>>>> As I have figured, it depends on the algorithm being used.
>>>>>>
>>>>>> Danula
>>>>>>
>>>>>> On Thu, Jul 30, 2015 at 9:14 AM, Nirmal Fernando <nir...@wso2.com> wrote:
>>>>>>
>>>>>>> Thanks Danula, I'll send an invite.
>>>>>>>
>>>>>>> On Wed, Jul 29, 2015 at 10:24 PM, Danula Eranjith <hmdanu...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Nirmal,
>>>>>>>>
>>>>>>>> I am available after 1.30pm on Tuesday, Wednesday and Thursday.
>>>>>>>>
>>>>>>>> Danula
>>>>>>>>
>>>>>>>> On Wed, Jul 29, 2015 at 12:10 PM, Nirmal Fernando <nir...@wso2.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Danula,
>>>>>>>>>
>>>>>>>>> Can we arrange a demo/review sometime next week? Please let me know a few time slots.
>>>>>>>>>
>>>>>>>>> On Thu, Jul 23, 2015 at 11:47 AM, Nirmal Fernando <nir...@wso2.com> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks Danula.
>>>>>>>>>>
>>>>>>>>>> On Thu, Jul 23, 2015 at 11:41 AM, Danula Eranjith <hmdanu...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> You can find the source at [1]. I have to do some refactoring when integrating to ML.
>>>>>>>>>>>
>>>>>>>>>>> [1] - https://github.com/danula/wso2-ml-wrangler-integration
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jul 23, 2015 at 11:31 AM, Nirmal Fernando <nir...@wso2.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks Danula. Please share the current code, if possible.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jul 23, 2015 at 8:41 AM, Danula Eranjith <hmdanu...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have succeeded in parsing the operations from wrangler javascript code to spark transformations I have written. Working on automating the process.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Last couple of steps would be changing the wrangler interface and integrating it into ML Wizard.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Jul 22, 2015 at 9:31 AM, Nirmal Fernando <nir...@wso2.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Could you please summarize the current status of the project and also the things left to do?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Jul 19, 2015 at 11:39 PM, Danula Eranjith <hmdanu...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thank you.
>>>>>>>>>>>>>>> Will use them. I already have some other kaggle datasets as well.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sun, Jul 19, 2015 at 11:30 PM, Danula Eranjith <hmdanu...@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Nirmal,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Would it be possible to get some sample data sets which are more likely to be pre-processed using wrangler. I am currently testing my implementations against small and more general data sets.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have checked datasets available at [1] as well. But there is nothing much to be processed as they are ready to be fed to ML.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [1] - https://github.com/wso2/product-ml/tree/master/modules/samples
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Jul 16, 2015 at 10:15 PM, Nirmal Fernando <nir...@wso2.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks Danula.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Jul 16, 2015 at 10:07 PM, Danula Eranjith <hmdanu...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Sorry for not keeping you in the loop.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> After considering and experimenting with several options, I am using the javascript code generated by wrangler to implement them using spark. I have used regular expressions to extract the operations, parameters and values and mapped them to spark transformations I previously developed.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The code generated by wrangler for certain functions has nested operations.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> (1)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> /* Fill split3 with values from above */
>>>>>>>>>>>>>>>>>>> w.add(dw.fill().column(["split3"])
>>>>>>>>>>>>>>>>>>>   .table(0)
>>>>>>>>>>>>>>>>>>>   .status("active")
>>>>>>>>>>>>>>>>>>>   .drop(false)
>>>>>>>>>>>>>>>>>>>   .direction("down")
>>>>>>>>>>>>>>>>>>>   .method("copy")
>>>>>>>>>>>>>>>>>>>   .row(undefined)
>>>>>>>>>>>>>>>>>>> )
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> (2)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> /* Delete rows where split1 is null */
>>>>>>>>>>>>>>>>>>> w.add(dw.filter().column([])
>>>>>>>>>>>>>>>>>>>   .table(0)
>>>>>>>>>>>>>>>>>>>   .status("active")
>>>>>>>>>>>>>>>>>>>   .drop(false)
>>>>>>>>>>>>>>>>>>>   .row(dw.row().column([])
>>>>>>>>>>>>>>>>>>>     .table(0)
>>>>>>>>>>>>>>>>>>>     .status("active")
>>>>>>>>>>>>>>>>>>>     .drop(false)
>>>>>>>>>>>>>>>>>>>     .conditions([dw.is_null().column([])
>>>>>>>>>>>>>>>>>>>       .table(0)
>>>>>>>>>>>>>>>>>>>       .status("active")
>>>>>>>>>>>>>>>>>>>       .drop(false)
>>>>>>>>>>>>>>>>>>>       .lcol("split1")
>>>>>>>>>>>>>>>>>>>       .value(undefined)
>>>>>>>>>>>>>>>>>>>       .op_str("is null")
>>>>>>>>>>>>>>>>>>>     ])
>>>>>>>>>>>>>>>>>>>   )
>>>>>>>>>>>>>>>>>>> )
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have succeeded in parsing the operations similar to (1) above and currently working on extending it to work on operations similar to (2).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Next step would be automating the process of spark transformation generation.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, Jul 15, 2015 at 7:32 PM, Nirmal Fernando <nir...@wso2.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Please send an update at least every week.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Wed, Jul 15, 2015 at 5:51 PM, Supun Sethunga <sup...@wso2.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Any update on the progress? Did you manage to integrate the transformations with the wrangler?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Thu, Jul 2, 2015 at 11:38 AM, Danula Eranjith <hmdanu...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Update on the current progress of the project and future activities as we discussed at the recent meeting.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> *Current Progress*
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I have completed the phase of creating spark transformations relevant to operations available in wrangler.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Operations implemented
>>>>>>>>>>>>>>>>>>>>>> - Fill
>>>>>>>>>>>>>>>>>>>>>> - Split
>>>>>>>>>>>>>>>>>>>>>> - Drop
>>>>>>>>>>>>>>>>>>>>>> - Delete
>>>>>>>>>>>>>>>>>>>>>> - Extract
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> *Future activities*
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> - Modify the wrangler interface to suit the current implementation
>>>>>>>>>>>>>>>>>>>>>> - Automate the process of generating Spark transformations
>>>>>>>>>>>>>>>>>>>>>> - Integrate wrangler into the ML workflow
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Sun, Jun 28, 2015 at 9:31 AM, Danula Eranjith <hmdanu...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> No, we haven't done a review yet.
>>>>>>>>>>>>>>>>>>>>>>> It would be great if we could have one so that I can discuss with you all and clarify the next steps of the implementation as you mentioned.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Sun, Jun 28, 2015 at 9:25 AM, Supun Sethunga <sup...@wso2.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Did we have a review for the work done so far? If not, shall we have one? We can clear out any doubts and issues as well.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>> Supun
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Jun 24, 2015 at 6:42 AM, Nirmal Fernando <nir...@wso2.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the update, keep them coming.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On a JavaRDD you can perform a collect() to get a list, AFAIR. Yes, this is costly, since it would load the whole dataset into memory. So, is this an operation which involves multiple rows?
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Jun 23, 2015 at 2:15 PM, Danula Eranjith <hmdanu...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Supun,
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I modified the "Fill" operation to add what you mentioned.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I used a workaround to implement certain parts of the operations such as filling with values from rows above and below. I created a List implementation using the toArray() method in JavaRDD and then converted it back to a JavaRDD after the operation.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> This will be inefficient (in terms of both memory and time) when working with very large data sets. But I think it's important to have these features included. Otherwise a user would be left with a very limited set of operations.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Please let me know if you have a different opinion on this.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Jun 16, 2015 at 9:44 AM, Supun Sethunga <sup...@wso2.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Somehow there are issues in implementing certain wrangler functions due to limitations in JavaRDD used in spark
>>>>>>>>>>>>>>>>>>>>>>>>>>>> e.g. -
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Fill operation - when filling with values from rows above and below
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Fold operation
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Agree, since rows will get executed randomly with spark, inter-row operations are not very meaningful. But you can slightly modify the implementation of the "Fill" operation, such as, to fill values based on an expression/static-value/mean etc. (not depending on other rows).
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>> Supun
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Jun 16, 2015 at 9:27 AM, Supun Sethunga <sup...@wso2.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Sorry for the late reply. Have you got the details you were looking for?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> It would be great if I could get to know which wrangler operations are important for a user of the ML
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Other than the ones you have mentioned in the proposal, I think it's better to have the "Translate" operation as well (to create a new column based on an existing column).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Supun
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Jun 4, 2015 at 10:11 PM, Danula Eranjith <hmdanu...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I am currently working on generating spark transformations related to the operations available in the data wrangler.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Data wrangler provides sufficient parameters to re-create these at spark. I have successfully implemented delete and split operations of wrangler in spark.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Once this phase is completed, I can either directly generate these scripts at wrangler or use the javascript output and convert it to spark depending on the implementation.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Somehow there are issues in implementing certain wrangler functions due to limitations in JavaRDD used in spark
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> e.g. -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Fill operation - when filling with values from rows above and below
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Fold operation
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> It would be great if I could get to know which wrangler operations are important for a user of the ML
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Jun 3, 2015 at 8:30 AM, Nirmal Fernando <nir...@wso2.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Please send an update of your work thus far.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Sun, May 10, 2015 at 2:30 PM, Nirmal Fernando <nir...@wso2.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Welcome to GSoC '15! Can you do some research on directly generating spark transformations using Wrangler and come up with a summary?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, May 8, 2015 at 11:03 AM, Danula Eranjith <hmdanu...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you for selecting my proposal [1] for GSoC 2015. I am really looking forward to work with you all and contribute to WSO2.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I have already completed my primary research on wrangler and would like to meet you to get feedback on the proposed architecture. I am planning to start working on the project before 25th of May.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1] - https://docs.google.com/document/d/18NFa23CrhXqnHrkl_AuRz3sQ3Axg7SEmiA7l66Hl9_0/edit?usp=sharing
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks & regards,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Nirmal
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Mobile: +94715779733
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>>>>>> *Supun Sethunga*
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Software Engineer
>>>>>>>>>>>>>>>>>>>>>>>>>>>> WSO2, Inc.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> http://wso2.com/
>>>>>>>>>>>>>>>>>>>>>>>>>>>> lean | enterprise | middleware
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Mobile : +94 716546324
>>>>>
>>>>> --
>>>>>
>>>>> Thanks & regards,
>>>>> Nirmal
>>>>>
>>>>> Team Lead - WSO2 Machine Learner
>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>> Mobile: +94715779733
>>>>> Blog: http://nirmalfdo.blogspot.com/
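
For reference, the approach described at the top of this thread (parse the wrangler script, keep the resulting operations as a list, and apply them all at the end) could be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the repository: the names WranglerSketch, Operation, FillDown, DeleteIfMissing, parse and applyAll are hypothetical, a plain List<String[]> stands in for JavaRDD<String[]> so the example runs without Spark, the operations are held in a simple ordered list rather than a full chain of responsibility, and the regular expressions only recognise the two statement shapes quoted earlier in the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WranglerSketch {

    /** A deferred transformation; each row is a String[] (stand-in for JavaRDD<String[]>). */
    public interface Operation {
        List<String[]> apply(List<String[]> rows);
    }

    /** Hypothetical "fill down": copy the nearest non-empty value above into empty cells. */
    public static final class FillDown implements Operation {
        private final int column;
        public FillDown(int column) { this.column = column; }
        @Override
        public List<String[]> apply(List<String[]> rows) {
            List<String[]> out = new ArrayList<>();
            String last = null;
            for (String[] row : rows) {
                String[] copy = row.clone();
                boolean missing = copy[column] == null || copy[column].isEmpty();
                if (!missing) {
                    last = copy[column];
                } else if (last != null) {
                    copy[column] = last;
                }
                out.add(copy);
            }
            return out;
        }
    }

    /** Hypothetical "delete rows where column is null": drops rows with a null or empty cell. */
    public static final class DeleteIfMissing implements Operation {
        private final int column;
        public DeleteIfMissing(int column) { this.column = column; }
        @Override
        public List<String[]> apply(List<String[]> rows) {
            List<String[]> out = new ArrayList<>();
            for (String[] row : rows) {
                if (row[column] != null && !row[column].isEmpty()) out.add(row);
            }
            return out;
        }
    }

    private static final Pattern OP = Pattern.compile("dw\\.(\\w+)\\(\\)");
    private static final Pattern COLUMN = Pattern.compile("\\.column\\(\\[\"(\\w+)\"\\]\\)");
    private static final Pattern LCOL = Pattern.compile("\\.lcol\\(\"(\\w+)\"\\)");

    /** Map one wrangler statement to an Operation; other parameters are ignored here. */
    public static Operation parse(String script, List<String> header) {
        Matcher op = OP.matcher(script);
        if (!op.find()) throw new IllegalArgumentException("no dw.<operation>() found");
        String name = op.group(1);
        if (name.equals("fill")) {
            return new FillDown(columnIndex(COLUMN, script, header));
        }
        if (name.equals("filter") && script.contains("dw.is_null()")) {
            return new DeleteIfMissing(columnIndex(LCOL, script, header));
        }
        throw new UnsupportedOperationException("not handled in this sketch: " + name);
    }

    private static int columnIndex(Pattern pattern, String script, List<String> header) {
        Matcher m = pattern.matcher(script);
        if (!m.find()) throw new IllegalArgumentException("column name not found");
        int index = header.indexOf(m.group(1));
        if (index < 0) throw new IllegalArgumentException("unknown column: " + m.group(1));
        return index;
    }

    /** Apply the stored operations in order, once, at the end of the process. */
    public static List<String[]> applyAll(List<Operation> operations, List<String[]> rows) {
        List<String[]> current = rows;
        for (Operation operation : operations) current = operation.apply(current);
        return current;
    }

    public static void main(String[] args) {
        List<String> header = Arrays.asList("split1", "split3");
        List<Operation> operations = new ArrayList<>();
        operations.add(parse("w.add(dw.filter().column([]).row(dw.row().conditions("
                + "[dw.is_null().lcol(\"split1\").op_str(\"is null\")])))", header));
        operations.add(parse("w.add(dw.fill().column([\"split3\"]).direction(\"down\")"
                + ".method(\"copy\"))", header));
        List<String[]> rows = Arrays.asList(
                new String[] {"a", "x"}, new String[] {"", "y"}, new String[] {"b", ""});
        for (String[] row : applyAll(operations, rows)) {
            System.out.println(String.join(",", row));
        }
    }
}
```

With Spark, a row-local operation such as the delete would presumably become a filter on the JavaRDD, while the fill-down depends on row order and would still need something like the collect-based workaround discussed earlier in the thread.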
_______________________________________________
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev