Hi Supun,

I modified the "Fill" operation to add what you mentioned.

I used a workaround to implement certain parts of the operations, such
as filling with values from the rows above and below.
I collected the JavaRDD into a List using its toArray() method, applied the
operation on the List, and then converted the result back to a JavaRDD.
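
A minimal sketch of the list-side step of this workaround, for the "fill from
the row above" case. It is not the actual implementation: the class name
FillDown, the method fillFromAbove, and the String[] row representation are
assumptions made for illustration. In Spark, the input list would come from
the driver-side collection (toArray(), or collect(), which I believe replaces
it in recent releases) and the result would go back through
JavaSparkContext.parallelize(); those calls are left out so the example runs
without Spark.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FillDown {

    /**
     * Returns a copy of the rows in which every empty cell (null or "") is
     * replaced by the nearest non-empty value above it in the same column.
     * Cells with no non-empty value above them are left unchanged.
     */
    public static List<String[]> fillFromAbove(List<String[]> rows) {
        List<String[]> filled = new ArrayList<>(rows.size());
        String[] lastSeen = new String[0]; // last non-empty value per column

        for (String[] row : rows) {
            String[] copy = Arrays.copyOf(row, row.length);
            if (lastSeen.length < copy.length) {
                // Grow the tracker if this row is wider than earlier rows.
                lastSeen = Arrays.copyOf(lastSeen, copy.length);
            }
            for (int col = 0; col < copy.length; col++) {
                boolean empty = copy[col] == null || copy[col].isEmpty();
                if (!empty) {
                    lastSeen[col] = copy[col];
                } else if (lastSeen[col] != null) {
                    copy[col] = lastSeen[col];
                }
            }
            filled.add(copy);
        }
        return filled;
    }

    public static void main(String[] args) {
        // Stand-in for the list collected from the JavaRDD (hypothetical data).
        List<String[]> rows = Arrays.asList(
                new String[] {"a", "1"},
                new String[] {"", "2"},
                new String[] {"b", ""});

        for (String[] row : fillFromAbove(rows)) {
            System.out.println(String.join(",", row));
        }
    }
}
```

The operation depends on row order, which is why it runs on the collected
list on the driver rather than inside a per-row transformation.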

This will be inefficient (in terms of both memory and time) when working
with very large data sets, because the whole data set has to be held in a
single List. But I think it's important to have these features included.
Otherwise a user would be left with a very limited set of operations.

Please let me know if you have a different opinion on this.

Thanks,
Danula

On Tue, Jun 16, 2015 at 9:44 AM, Supun Sethunga <sup...@wso2.com> wrote:

>> Somehow there are issues in implementing certain wrangler functions due to
>> limitations in JavaRDD used in spark
>> e.g. -
>> Fill operation - when filling with values from rows above and below
>> Fold operation
>
>
> Agreed. Since Spark does not guarantee the order in which rows are
> processed, inter-row operations are not very meaningful.
> But you can slightly modify the implementation of the "Fill" operation
> so that it fills values based on an expression, a static value, the mean,
> etc. (not depending on other rows).
>
> Thanks,
> Supun
>
> On Tue, Jun 16, 2015 at 9:27 AM, Supun Sethunga <sup...@wso2.com> wrote:
>
>> Hi Danula,
>>
>> Sorry for the late reply. Have you got the details you were looking for?
>>
>>> It would be great if I could get to know which wrangler operations are
>>> important for a user of the ML
>>
>>
>> Other than the ones you have mentioned in the proposal, I think it's better
>> to have a "Translate" operation as well (to create a new column based on
>> an existing column).
>>
>> Thanks,
>> Supun
>>
>>
>>
>> On Thu, Jun 4, 2015 at 10:11 PM, Danula Eranjith <hmdanu...@gmail.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> I am currently working on generating spark transformations related to
>>> the operations available in the data wrangler.
>>>
>>> Data wrangler provides sufficient parameters to re-create these in
>>> Spark. I have successfully implemented the delete and split operations of
>>> wrangler in Spark.
>>>
>>> Once this phase is completed, I can either directly generate these
>>> scripts at wrangler or use the javascript output and convert it to spark
>>> depending on the implementation.
>>>
>>> Somehow there are issues in implementing certain wrangler functions due
>>> to limitations in JavaRDD used in spark
>>>
>>> e.g. -
>>> Fill operation - when filling with values from rows above and below
>>> Fold operation
>>>
>>> It would be great if I could get to know which wrangler operations are
>>> important for a user of the ML
>>>
>>> Thanks,
>>> Danula
>>>
>>> On Wed, Jun 3, 2015 at 8:30 AM, Nirmal Fernando <nir...@wso2.com> wrote:
>>>
>>>> Hi Danula,
>>>>
>>>> Please send an update of your work thus far.
>>>>
>>>> On Sun, May 10, 2015 at 2:30 PM, Nirmal Fernando <nir...@wso2.com>
>>>> wrote:
>>>>
>>>>> Hi Danula,
>>>>>
>>>>> Welcome to GSoC '15! Can you do some research on directly generating
>>>>> Spark transformations using Wrangler and come up with a summary?
>>>>>
>>>>> On Fri, May 8, 2015 at 11:03 AM, Danula Eranjith <hmdanu...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> Thank you for selecting my proposal [1]
>>>>>> <https://docs.google.com/document/d/18NFa23CrhXqnHrkl_AuRz3sQ3Axg7SEmiA7l66Hl9_0/edit?usp=sharing>
>>>>>> for GSoC 2015. I am really looking forward to working with you all and
>>>>>> contributing to WSO2.
>>>>>>
>>>>>> I have already completed my primary research on wrangler and would
>>>>>> like to meet you to get feedback on the proposed architecture. I am
>>>>>> planning to start working on the project before 25th of May.
>>>>>>
>>>>>> Thank you,
>>>>>> Danula
>>>>>>
>>>>>> [1] -
>>>>>> https://docs.google.com/document/d/18NFa23CrhXqnHrkl_AuRz3sQ3Axg7SEmiA7l66Hl9_0/edit?usp=sharing
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Thanks & regards,
>>>>> Nirmal
>>>>>
>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>> Mobile: +94715779733
>>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Thanks & regards,
>>>> Nirmal
>>>>
>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>> Mobile: +94715779733
>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> *Supun Sethunga*
>> Software Engineer
>> WSO2, Inc.
>> http://wso2.com/
>> lean | enterprise | middleware
>> Mobile : +94 716546324
>>
>
>
>
> --
> *Supun Sethunga*
> Software Engineer
> WSO2, Inc.
> http://wso2.com/
> lean | enterprise | middleware
> Mobile : +94 716546324
>
_______________________________________________
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev
