Hi Danula,

Please send an update at least every week.

On Wed, Jul 15, 2015 at 5:51 PM, Supun Sethunga <sup...@wso2.com> wrote:

> Hi Danula,
>
> Any update on the progress? Did you manage to integrate the
> transformations with the wrangler?
>
> Thanks,
>
> On Thu, Jul 2, 2015 at 11:38 AM, Danula Eranjith <hmdanu...@gmail.com>
> wrote:
>
>> Hi all,
>>
>> Here is an update on the current progress of the project and the future
>> activities we discussed at the recent meeting.
>>
>> *Current Progress*
>>
>> I have completed the phase of creating Spark transformations that
>> correspond to the operations available in wrangler.
>>
>> Operations implemented
>> - Fill
>> - Split
>> - Drop
>> - Delete
>> - Extract
>>
>> *Future activities*
>>
>> - Modify the wrangler interface to suit the current implementation
>> - Automate the process of generating Spark transformations
>> - Integrate wrangler into the ML workflow
>>
>> Thanks,
>> Danula
>>
>> On Sun, Jun 28, 2015 at 9:31 AM, Danula Eranjith <hmdanu...@gmail.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> No, we haven't done a review yet.
>>> It would be great if we could have one so that I can discuss with you
>>> all and clarify the next steps of the implementation as you mentioned.
>>>
>>> Thanks
>>> Danula
>>>
>>> On Sun, Jun 28, 2015 at 9:25 AM, Supun Sethunga <sup...@wso2.com> wrote:
>>>
>>>> Hi Danula,
>>>>
>>>> Did we have a review for the work done so far? If not, shall we have
>>>> one? We can clear up any doubts and issues as well.
>>>>
>>>> Thanks,
>>>> Supun
>>>>
>>>> On Wed, Jun 24, 2015 at 6:42 AM, Nirmal Fernando <nir...@wso2.com>
>>>> wrote:
>>>>
>>>>> Hi Danula,
>>>>>
>>>>> Thanks for the update, keep them coming.
>>>>>
>>>>> On a JavaRDD you can perform a collect() to get a list, AFAIR. Yes,
>>>>> this is costly, since it would load the whole dataset into memory. So, is
>>>>> this an operation which involves multiple rows?
>>>>>
>>>>> On Tue, Jun 23, 2015 at 2:15 PM, Danula Eranjith <hmdanu...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Supun,
>>>>>>
>>>>>> I modified the "Fill" operation to add what you mentioned.
>>>>>>
>>>>>> I used a workaround to implement certain parts of the operations,
>>>>>> such as filling with values from rows above and below.
>>>>>> I created a List using the toArray() method in JavaRDD and then
>>>>>> converted it back to a JavaRDD after the operation.
>>>>>>
>>>>>> This will be inefficient (in terms of both memory and time) when
>>>>>> working with very large data sets. But I think it's important to have
>>>>>> these features included. Otherwise a user would be left with a very
>>>>>> limited set of operations.
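
The list-based workaround described above could look roughly like the following. This is only a minimal sketch in plain Java with no Spark dependency: it assumes each row is a String[] and that the list was obtained with JavaRDD.collect() and would be handed back to Spark with JavaSparkContext.parallelize(). The class and method names (FillDown, fillFromAbove) are hypothetical and not taken from the project.

```java
import java.util.ArrayList;
import java.util.List;

final class FillDown {
    private FillDown() {}

    /**
     * Returns a copy of the rows in which every null or empty cell of the
     * given column is replaced by the nearest non-empty value above it.
     * Cells with no non-empty value above them are left unchanged.
     */
    static List<String[]> fillFromAbove(List<String[]> rows, int column) {
        List<String[]> filled = new ArrayList<>(rows.size());
        String lastSeen = null; // most recent non-empty value in this column
        for (String[] row : rows) {
            String[] copy = row.clone(); // leave the input rows untouched
            if (copy[column] == null || copy[column].isEmpty()) {
                if (lastSeen != null) {
                    copy[column] = lastSeen;
                }
            } else {
                lastSeen = copy[column];
            }
            filled.add(copy);
        }
        return filled;
    }
}
```

The loop depends on row order, which is why the whole data set has to sit in one ordered list on a single machine; that is the memory and time cost mentioned above.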
>>>>>>
>>>>>> Please let me know if you have a different opinion on this.
>>>>>>
>>>>>> Thanks,
>>>>>> Danula
>>>>>>
>>>>>> On Tue, Jun 16, 2015 at 9:44 AM, Supun Sethunga <sup...@wso2.com>
>>>>>> wrote:
>>>>>>
>>>>>>>> However, there are issues in implementing certain wrangler functions
>>>>>>>> due to limitations of the JavaRDD used in Spark
>>>>>>>> e.g. -
>>>>>>>> Fill operation - when filling with values from rows above and below
>>>>>>>> Fold operation
>>>>>>>
>>>>>>>
>>>>>>> Agreed. Since Spark processes rows in no guaranteed order, inter-row
>>>>>>> operations are not very meaningful.
>>>>>>> But you can slightly modify the implementation of the "Fill"
>>>>>>> operation to fill values based on an expression, a static value, the
>>>>>>> mean, etc. (not depending on other rows).
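
A row-independent "Fill" of the kind suggested above could be sketched as follows. This is an illustration only, in plain Java without Spark: it assumes rows are String[] with numeric values stored as text, and the names FillWithMean, columnMean and fillWith are hypothetical. The fill value is computed once up front; after that each row is handled on its own, so the returned function is the sort of thing one would pass to a per-row map such as JavaRDD.map().

```java
import java.util.List;
import java.util.function.Function;

final class FillWithMean {
    private FillWithMean() {}

    /** Mean of the non-missing numeric cells in a column; NaN if there are none. */
    static double columnMean(List<String[]> rows, int column) {
        return rows.stream()
                .map(row -> row[column])
                .filter(value -> value != null && !value.isEmpty())
                .mapToDouble(Double::parseDouble)
                .average()
                .orElse(Double.NaN);
    }

    /**
     * Builds a per-row function that replaces a missing cell with a fixed
     * value. It looks at one row only, so row order does not matter.
     */
    static Function<String[], String[]> fillWith(int column, String value) {
        return row -> {
            String[] copy = row.clone();
            if (copy[column] == null || copy[column].isEmpty()) {
                copy[column] = value;
            }
            return copy;
        };
    }
}
```

A static value works the same way: skip columnMean and pass the constant straight to fillWith.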
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Supun
>>>>>>>
>>>>>>> On Tue, Jun 16, 2015 at 9:27 AM, Supun Sethunga <sup...@wso2.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Danula,
>>>>>>>>
>>>>>>>> Sorry for the late reply. Have you got the details you were looking
>>>>>>>> for?
>>>>>>>>
>>>>>>>>> It would be great if I could get to know which wrangler operations
>>>>>>>>> are important for a user of the ML
>>>>>>>>
>>>>>>>>
>>>>>>>> Other than the ones you have mentioned in the proposal, I think it's
>>>>>>>> better to have a "Translate" operation as well (to create a new
>>>>>>>> column based on an existing column).
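
One possible reading of the "Translate" operation described above is a per-row function that appends a column derived from an existing one. The sketch below is plain Java without Spark; the String[] row layout and the names Translate and appendDerived are assumptions made for illustration, not the wrangler or project API.

```java
import java.util.Arrays;
import java.util.function.Function;
import java.util.function.UnaryOperator;

final class Translate {
    private Translate() {}

    /**
     * Builds a per-row function that appends one new column whose value is
     * computed from the value in sourceColumn. Existing columns are kept.
     */
    static Function<String[], String[]> appendDerived(int sourceColumn,
                                                      UnaryOperator<String> mapping) {
        return row -> {
            String[] extended = Arrays.copyOf(row, row.length + 1);
            extended[row.length] = mapping.apply(row[sourceColumn]);
            return extended;
        };
    }
}
```

Like the row-independent fill, this needs only the current row, so it would not require collecting the data set into a list first.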
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Supun
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Jun 4, 2015 at 10:11 PM, Danula Eranjith <
>>>>>>>> hmdanu...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> I am currently working on generating spark transformations related
>>>>>>>>> to the operations available in the data wrangler.
>>>>>>>>>
>>>>>>>>> Data wrangler provides sufficient parameters to re-create these in
>>>>>>>>> Spark. I have successfully implemented the delete and split operations
>>>>>>>>> of wrangler in Spark.
>>>>>>>>>
>>>>>>>>> Once this phase is completed, I can either directly generate these
>>>>>>>>> scripts at wrangler or use the JavaScript output and convert it to
>>>>>>>>> Spark, depending on the implementation.
>>>>>>>>>
>>>>>>>>> However, there are issues in implementing certain wrangler
>>>>>>>>> functions due to limitations of the JavaRDD used in Spark
>>>>>>>>>
>>>>>>>>> e.g. -
>>>>>>>>> Fill operation - when filling with values from rows above and below
>>>>>>>>> Fold operation
>>>>>>>>>
>>>>>>>>> It would be great if I could get to know which wrangler operations
>>>>>>>>> are important for a user of the ML
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Danula
>>>>>>>>>
>>>>>>>>> On Wed, Jun 3, 2015 at 8:30 AM, Nirmal Fernando <nir...@wso2.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Danula,
>>>>>>>>>>
>>>>>>>>>> Please send an update of your work thus far.
>>>>>>>>>>
>>>>>>>>>> On Sun, May 10, 2015 at 2:30 PM, Nirmal Fernando <nir...@wso2.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>
>>>>>>>>>>> Welcome to GSoC '15! Can you do some research on directly
>>>>>>>>>>> generating Spark transformations using Wrangler and come up with a
>>>>>>>>>>> summary?
>>>>>>>>>>>
>>>>>>>>>>> On Fri, May 8, 2015 at 11:03 AM, Danula Eranjith <
>>>>>>>>>>> hmdanu...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for selecting my proposal [1] for GSoC 2015. I am really
>>>>>>>>>>>> looking forward to working with you all and contributing to WSO2.
>>>>>>>>>>>>
>>>>>>>>>>>> I have already completed my primary research on wrangler and
>>>>>>>>>>>> would like to meet you to get feedback on the proposed
>>>>>>>>>>>> architecture. I am planning to start working on the project before
>>>>>>>>>>>> the 25th of May.
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you,
>>>>>>>>>>>> Danula
>>>>>>>>>>>>
>>>>>>>>>>>> [1] -
>>>>>>>>>>>> https://docs.google.com/document/d/18NFa23CrhXqnHrkl_AuRz3sQ3Axg7SEmiA7l66Hl9_0/edit?usp=sharing
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>>
>>>>>>>>>>> Thanks & regards,
>>>>>>>>>>> Nirmal
>>>>>>>>>>>
>>>>>>>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>>>>>>>> Mobile: +94715779733
>>>>>>>>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> *Supun Sethunga*
>>>>>>>> Software Engineer
>>>>>>>> WSO2, Inc.
>>>>>>>> http://wso2.com/
>>>>>>>> lean | enterprise | middleware
>>>>>>>> Mobile : +94 716546324
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>
>
> --
> *Supun Sethunga*
> Software Engineer
> WSO2, Inc.
> http://wso2.com/
> lean | enterprise | middleware
> Mobile : +94 716546324
>



-- 

Thanks & regards,
Nirmal

Associate Technical Lead - Data Technologies Team, WSO2 Inc.
Mobile: +94715779733
Blog: http://nirmalfdo.blogspot.com/
_______________________________________________
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev
