Hi Danula,

Please send an update at least every week.

On Wed, Jul 15, 2015 at 5:51 PM, Supun Sethunga <sup...@wso2.com> wrote:

> Hi Danula,
>
> Any update on the progress? Did you manage to integrate the
> transformations with the wrangler?
>
> Thanks,
>
> On Thu, Jul 2, 2015 at 11:38 AM, Danula Eranjith <hmdanu...@gmail.com>
> wrote:
>
>> Hi all,
>>
>> Update on the current progress of the project and future activities as we
>> discussed at the recent meeting.
>>
>> *Current Progress*
>>
>> I have completed the phase of creating Spark transformations relevant to
>> operations available in wrangler.
>>
>> Operations implemented
>> - Fill
>> - Split
>> - Drop
>> - Delete
>> - Extract
>>
>> *Future activities*
>>
>> - Modify the wrangler interface to suit the current implementation
>> - Automate the process of generating Spark transformations
>> - Integrate wrangler into the ML workflow
>>
>> Thanks,
>> Danula
>>
>> On Sun, Jun 28, 2015 at 9:31 AM, Danula Eranjith <hmdanu...@gmail.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> No, we haven't done a review yet.
>>> It would be great if we could have one so that I can discuss with you
>>> all and clarify the next steps of the implementation as you mentioned.
>>>
>>> Thanks
>>> Danula
>>>
>>> On Sun, Jun 28, 2015 at 9:25 AM, Supun Sethunga <sup...@wso2.com> wrote:
>>>
>>>> Hi Danula,
>>>>
>>>> Did we have a review for the work done so far? If not, shall we have
>>>> one? We can clear out any doubts and issues as well.
>>>>
>>>> Thanks,
>>>> Supun
>>>>
>>>> On Wed, Jun 24, 2015 at 6:42 AM, Nirmal Fernando <nir...@wso2.com>
>>>> wrote:
>>>>
>>>>> Hi Danula,
>>>>>
>>>>> Thanks for the update, keep them coming.
>>>>>
>>>>> On a JavaRDD you can perform a collect() to get a list, AFAIR. Yes,
>>>>> this is costly, since it would load the whole dataset into memory. So,
>>>>> is this an operation which involves multiple rows?
>>>>>
>>>>> On Tue, Jun 23, 2015 at 2:15 PM, Danula Eranjith <hmdanu...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Supun,
>>>>>>
>>>>>> I modified the "Fill" operation to add what you mentioned.
>>>>>>
>>>>>> I used a workaround to implement certain parts of the operations,
>>>>>> such as filling with values from rows above and below.
>>>>>> I created a List implementation using the toArray() method in JavaRDD
>>>>>> and then converted it back to a JavaRDD after the operation.
>>>>>>
>>>>>> This will be inefficient (in terms of both memory and time) when
>>>>>> working with very large data sets. But I think it's important to have
>>>>>> these features included. Otherwise a user would be left with a very
>>>>>> limited set of operations.
>>>>>>
>>>>>> Please let me know if you have a different opinion on this.
>>>>>>
>>>>>> Thanks,
>>>>>> Danula
>>>>>>
>>>>>> On Tue, Jun 16, 2015 at 9:44 AM, Supun Sethunga <sup...@wso2.com>
>>>>>> wrote:
>>>>>>
>>>>>>>> Somehow there are issues in implementing certain wrangler functions
>>>>>>>> due to limitations in JavaRDD used in Spark
>>>>>>>> e.g. -
>>>>>>>> Fill operation - when filling with values from rows above and below
>>>>>>>> Fold operation
>>>>>>>
>>>>>>> Agree, since rows get processed in no particular order with Spark,
>>>>>>> inter-row operations are not very meaningful.
>>>>>>> But you can slightly modify the implementation of the "Fill"
>>>>>>> operation, for example to fill values based on an expression, a
>>>>>>> static value, the mean, etc. (not depending on other rows).
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Supun
>>>>>>>
>>>>>>> On Tue, Jun 16, 2015 at 9:27 AM, Supun Sethunga <sup...@wso2.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Danula,
>>>>>>>>
>>>>>>>> Sorry for the late reply. Have you got the details you were looking
>>>>>>>> for?
>>>>>>>>
>>>>>>>>> It would be great if I could get to know which wrangler operations
>>>>>>>>> are important for a user of the ML
>>>>>>>>
>>>>>>>> Other than the ones you have mentioned in the proposal, I think it's
>>>>>>>> better to have a "Translate" operation as well (to create a new
>>>>>>>> column based on an existing column).
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Supun
>>>>>>>>
>>>>>>>> On Thu, Jun 4, 2015 at 10:11 PM, Danula Eranjith
>>>>>>>> <hmdanu...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> I am currently working on generating Spark transformations related
>>>>>>>>> to the operations available in the data wrangler.
>>>>>>>>>
>>>>>>>>> Data wrangler provides sufficient parameters to re-create these in
>>>>>>>>> Spark. I have successfully implemented the delete and split
>>>>>>>>> operations of wrangler in Spark.
>>>>>>>>>
>>>>>>>>> Once this phase is completed, I can either directly generate these
>>>>>>>>> scripts at wrangler or use the JavaScript output and convert it to
>>>>>>>>> Spark, depending on the implementation.
>>>>>>>>>
>>>>>>>>> Somehow there are issues in implementing certain wrangler
>>>>>>>>> functions due to limitations in JavaRDD used in Spark
>>>>>>>>>
>>>>>>>>> e.g. -
>>>>>>>>> Fill operation - when filling with values from rows above and below
>>>>>>>>> Fold operation
>>>>>>>>>
>>>>>>>>> It would be great if I could get to know which wrangler operations
>>>>>>>>> are important for a user of the ML
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Danula
>>>>>>>>>
>>>>>>>>> On Wed, Jun 3, 2015 at 8:30 AM, Nirmal Fernando <nir...@wso2.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Danula,
>>>>>>>>>>
>>>>>>>>>> Please send an update of your work thus far.
>>>>>>>>>>
>>>>>>>>>> On Sun, May 10, 2015 at 2:30 PM, Nirmal Fernando <nir...@wso2.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>
>>>>>>>>>>> Welcome to GSoC '15!
>>>>>>>>>>> Can you do some research on directly generating Spark
>>>>>>>>>>> transformations using Wrangler and come up with a summary?
>>>>>>>>>>>
>>>>>>>>>>> On Fri, May 8, 2015 at 11:03 AM, Danula Eranjith
>>>>>>>>>>> <hmdanu...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for selecting my proposal [1] for GSoC 2015. I am really
>>>>>>>>>>>> looking forward to working with you all and contributing to WSO2.
>>>>>>>>>>>>
>>>>>>>>>>>> I have already completed my primary research on wrangler and
>>>>>>>>>>>> would like to meet you to get feedback on the proposed
>>>>>>>>>>>> architecture. I am planning to start working on the project
>>>>>>>>>>>> before the 25th of May.
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you,
>>>>>>>>>>>> Danula
>>>>>>>>>>>>
>>>>>>>>>>>> [1] -
>>>>>>>>>>>> https://docs.google.com/document/d/18NFa23CrhXqnHrkl_AuRz3sQ3Axg7SEmiA7l66Hl9_0/edit?usp=sharing
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>>
>>>>>>>>>>> Thanks & regards,
>>>>>>>>>>> Nirmal
>>>>>>>>>>>
>>>>>>>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>>>>>>>> Mobile: +94715779733
>>>>>>>>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>>>>>
>>>>>>>> --
>>>>>>>> *Supun Sethunga*
>>>>>>>> Software Engineer
>>>>>>>> WSO2, Inc.
>>>>>>>> http://wso2.com/
>>>>>>>> lean | enterprise | middleware
>>>>>>>> Mobile : +94 716546324

--

Thanks & regards,
Nirmal

Associate Technical Lead - Data Technologies Team, WSO2 Inc.
Mobile: +94715779733
Blog: http://nirmalfdo.blogspot.com/
_______________________________________________
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev
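
For reference, the two kinds of "Fill" discussed in this thread can be contrasted in a small plain-Java sketch. It deliberately works on a java.util.List rather than on Spark's JavaRDD, and the class and method names (FillSketch, fillFromAbove, fillWithMean) are hypothetical, not taken from the project. It is only meant to show why one variant depends on row order and the other does not.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java sketch (no Spark) of two Fill variants; all names are hypothetical. */
final class FillSketch {

    /** Row-dependent fill: copy the nearest non-null value from the rows above. */
    static List<Double> fillFromAbove(List<Double> column) {
        List<Double> out = new ArrayList<>(column.size());
        Double last = null; // last non-null value seen so far, in row order
        for (Double v : column) {
            if (v != null) {
                last = v;
            }
            // A missing cell takes the value above it; it stays null if none exists.
            out.add(v != null ? v : last);
        }
        return out;
    }

    /** Row-independent fill: replace every missing cell with the column mean. */
    static List<Double> fillWithMean(List<Double> column) {
        double sum = 0.0;
        int count = 0;
        for (Double v : column) {
            if (v != null) {
                sum += v;
                count++;
            }
        }
        if (count == 0) {
            return new ArrayList<>(column); // nothing to average, leave as-is
        }
        Double mean = Double.valueOf(sum / count);
        List<Double> out = new ArrayList<>(column.size());
        for (Double v : column) {
            // Each cell is decided on its own, without looking at neighbouring rows.
            out.add(v != null ? v : mean);
        }
        return out;
    }
}
```

fillFromAbove only makes sense when the rows are visited in order, which is presumably why the workaround described above collects the data into a list first. fillWithMean needs one aggregate over the column and then a per-row replacement, so it does not depend on other rows in the way Supun's suggestion describes.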