Hi Supun,

The following points were discussed in the meeting.

*Integration to ML*

We decided to add the wrangler interface as the first step considering the
current ML implementation.

So the steps from a user's perspective would be as follows:

- A sample from the dataset will be sent to the wrangler interface.
- The user can apply the desired operations in the wrangler interface.
- The user can return to ML by clicking a button in the interface.
- Viewing the script will be optional for the user.
- On returning to ML, Spark transformations are automatically generated
and applied to the dataset.

*Spark Transformations*

I have implemented all the wrangler transformations by extending a single
abstract class. These operations are invoked by parsing the JavaScript code
generated by wrangler. However, since the Spark transformations in ML are
applied all together at the end of the process, I have to persist all the
parameters and keep the operations as a list that can be invoked later.
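
To illustrate the parsing and deferral described above, here is a minimal
sketch in Python (chosen for brevity; the actual implementation is Java on
Spark and is not shown in this thread). It assumes the simple, non-nested form
of wrangler output, and the names parse_operations, to_value, OPERATION and
PARAMETER are hypothetical. Nested operations, such as filters with row
conditions, would need more than these regular expressions.

```python
import json
import re

# Sample wrangler output for a simple (non-nested) operation.
SCRIPT = '''
/* Fill split3  with values from above */
w.add(dw.fill().column(["split3"])
.table(0)
.status("active")
.drop(false)
.direction("down")
.method("copy")
.row(undefined)
)
'''

# Matches the operation name in "dw.<name>()".
OPERATION = re.compile(r"dw\.(\w+)\(\)")
# Matches each chained ".<param>(<value>)" call whose value has no parentheses.
PARAMETER = re.compile(r"\.(\w+)\(([^()]*)\)")


def to_value(text):
    """Convert a JavaScript literal to a Python value; undefined becomes None."""
    try:
        return json.loads(text)
    except ValueError:
        return None


def parse_operations(script):
    """Return one {'name', 'params'} record per w.add(...) statement."""
    operations = []
    for statement in script.split("w.add(")[1:]:
        match = OPERATION.search(statement)
        # Only look at the chained calls that follow "dw.<name>()".
        chained = statement[match.end():]
        params = {key: to_value(value) for key, value in PARAMETER.findall(chained)}
        operations.append({"name": match.group(1), "params": params})
    return operations


# The records are kept as a list and can be applied later, all together.
pending = parse_operations(SCRIPT)
```

Keeping plain records rather than applying each operation immediately is what
allows the whole list to be replayed on the full dataset at the end.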

Nirmal pointed out that this could be achieved using the Chain of
Responsibility design pattern. I am currently changing the implementation
accordingly.
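
As a rough illustration of the Chain of Responsibility idea, the sketch below
(Python for brevity, not the actual Java code) passes each stored operation
down a chain of handlers until one recognises it. The class names, the
"column" index parameter and the "on" separator parameter are all
hypothetical, and rows are modelled as plain lists of strings standing in for
the rows of a JavaRDD.

```python
class Handler:
    """One link in the chain: handles its own operation or forwards it."""

    name = None  # operation name this handler is responsible for

    def __init__(self, successor=None):
        self.successor = successor

    def handle(self, operation, rows):
        if operation["name"] == self.name:
            return self.apply(operation["params"], rows)
        if self.successor is None:
            raise ValueError("unsupported operation: " + operation["name"])
        return self.successor.handle(operation, rows)

    def apply(self, params, rows):
        raise NotImplementedError


class DropHandler(Handler):
    """Removes the column at the given index from every row."""

    name = "drop"

    def apply(self, params, rows):
        index = params["column"]
        return [row[:index] + row[index + 1:] for row in rows]


class SplitHandler(Handler):
    """Splits the column at the given index on a separator."""

    name = "split"

    def apply(self, params, rows):
        index, separator = params["column"], params["on"]
        return [
            row[:index] + row[index].split(separator) + row[index + 1:]
            for row in rows
        ]


def run(chain, operations, rows):
    """Apply the stored operations in order, each routed through the chain."""
    for operation in operations:
        rows = chain.handle(operation, rows)
    return rows


chain = SplitHandler(DropHandler())
operations = [
    {"name": "split", "params": {"column": 0, "on": "-"}},
    {"name": "drop", "params": {"column": 2}},
]
rows = [["a-b", "1", "x"], ["c-d", "2", "y"]]
result = run(chain, operations, rows)
```

Adding a new wrangler operation then only requires a new handler class linked
into the chain; the code that replays the list stays unchanged.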

I will get back to you and Nirmal when the automation process is complete so
that we can start the integration.

Regards,
Danula

On Mon, Aug 10, 2015 at 9:29 PM, Supun Sethunga <sup...@wso2.com> wrote:

> Any update?
>
> On Fri, Aug 7, 2015 at 10:13 AM, Supun Sethunga <sup...@wso2.com> wrote:
>
>> Hi Danula,
>>
>> Sorry I couldn't join the meeting. Can you please share the
>> meeting/review notes, along with the progress on the suggestions and
>> what is left to be done overall?
>>
>> Thanks,
>> Supun
>>
>> On Wed, Aug 5, 2015 at 3:47 AM, Nirmal Fernando <nir...@wso2.com> wrote:
>>
>>> Hi Danula,
>>>
>>> It should be a JavaRDD<String[]>, where each row represents the feature
>>> vector as a String[].
>>>
>>> On Tue, Aug 4, 2015 at 11:51 AM, Danula Eranjith <hmdanu...@gmail.com>
>>> wrote:
>>>
>>>> In other words,
>>>> what would be the preferred output type for a dataset that is
>>>> pre-processed by wrangler?
>>>> As I have observed, different algorithms use different JavaRDD types as
>>>> input (JavaRDD<String>, JavaRDD<Vector>, etc.).
>>>>
>>>> On Tue, Aug 4, 2015 at 11:48 AM, Nirmal Fernando <nir...@wso2.com>
>>>> wrote:
>>>>
>>>>> Hi Danula,
>>>>>
>>>>> On Tue, Aug 4, 2015 at 11:47 AM, Danula Eranjith <hmdanu...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Nirmal,
>>>>>>
>>>>>> In ML, what is the preferred way of keeping data in a single row of
>>>>>> JavaRDD?
>>>>>>
>>>>>
>>>>> I didn't quite get your question. Can you elaborate please?
>>>>>
>>>>>
>>>>>>
>>>>>> As far as I can tell, it depends on the algorithm being used.
>>>>>>
>>>>>> Danula
>>>>>>
>>>>>> On Thu, Jul 30, 2015 at 9:14 AM, Nirmal Fernando <nir...@wso2.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks Danula, I'll send an invite.
>>>>>>>
>>>>>>> On Wed, Jul 29, 2015 at 10:24 PM, Danula Eranjith <
>>>>>>> hmdanu...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Nirmal,
>>>>>>>>
>>>>>>>> I am available after 1.30pm on Tuesday, Wednesday and Thursday.
>>>>>>>>
>>>>>>>> Danula
>>>>>>>>
>>>>>>>> On Wed, Jul 29, 2015 at 12:10 PM, Nirmal Fernando <nir...@wso2.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Danula,
>>>>>>>>>
>>>>>>>>> Can we arrange a demo/review sometime next week? Please let me
>>>>>>>>> know a few time slots.
>>>>>>>>>
>>>>>>>>> On Thu, Jul 23, 2015 at 11:47 AM, Nirmal Fernando <nir...@wso2.com
>>>>>>>>> > wrote:
>>>>>>>>>
>>>>>>>>>> Thanks Danula.
>>>>>>>>>>
>>>>>>>>>> On Thu, Jul 23, 2015 at 11:41 AM, Danula Eranjith <
>>>>>>>>>> hmdanu...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> You can find the source at [1]. I will have to do some
>>>>>>>>>>> refactoring when integrating it into ML.
>>>>>>>>>>>
>>>>>>>>>>> [1] - https://github.com/danula/wso2-ml-wrangler-integration
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jul 23, 2015 at 11:31 AM, Nirmal Fernando <
>>>>>>>>>>> nir...@wso2.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks Danula. Please share the current code, if possible.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jul 23, 2015 at 8:41 AM, Danula Eranjith <
>>>>>>>>>>>> hmdanu...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have succeeded in parsing the operations from the wrangler
>>>>>>>>>>>>> JavaScript code into the Spark transformations I have written.
>>>>>>>>>>>>> I am now working on automating the process.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The last couple of steps would be changing the wrangler interface
>>>>>>>>>>>>> and integrating it into the ML Wizard.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Jul 22, 2015 at 9:31 AM, Nirmal Fernando <
>>>>>>>>>>>>> nir...@wso2.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Could you please summarize the current status of the project
>>>>>>>>>>>>>> and also the things left to do?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Jul 19, 2015 at 11:39 PM, Danula Eranjith <
>>>>>>>>>>>>>> hmdanu...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thank you.
>>>>>>>>>>>>>>> Will use them. I already have some other kaggle datasets as
>>>>>>>>>>>>>>> well.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sun, Jul 19, 2015 at 11:30 PM, Danula Eranjith <
>>>>>>>>>>>>>>>> hmdanu...@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Nirmal,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Would it be possible to get some sample datasets that are
>>>>>>>>>>>>>>>>> more likely to need pre-processing with wrangler? I am
>>>>>>>>>>>>>>>>> currently testing my implementations against small, more
>>>>>>>>>>>>>>>>> general datasets.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have also checked the datasets available at [1], but there
>>>>>>>>>>>>>>>>> is not much to process, as they are ready to be fed to ML.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>>>> https://github.com/wso2/product-ml/tree/master/modules/samples
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Jul 16, 2015 at 10:15 PM, Nirmal Fernando <
>>>>>>>>>>>>>>>>> nir...@wso2.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks Danula.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Jul 16, 2015 at 10:07 PM, Danula Eranjith <
>>>>>>>>>>>>>>>>>> hmdanu...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Sorry for not keeping you in the loop.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> After considering and experimenting with several
>>>>>>>>>>>>>>>>>>> options, I am using the JavaScript code generated by
>>>>>>>>>>>>>>>>>>> wrangler to re-implement its operations in Spark. I have
>>>>>>>>>>>>>>>>>>> used regular expressions to extract the operations,
>>>>>>>>>>>>>>>>>>> parameters and values, and mapped them to the Spark
>>>>>>>>>>>>>>>>>>> transformations I previously developed.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The code generated by wrangler for certain functions
>>>>>>>>>>>>>>>>>>> has nested operations.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> (1)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> /* Fill split3  with values from above */
>>>>>>>>>>>>>>>>>>> w.add(dw.fill().column(["split3"])
>>>>>>>>>>>>>>>>>>> .table(0)
>>>>>>>>>>>>>>>>>>> .status("active")
>>>>>>>>>>>>>>>>>>> .drop(false)
>>>>>>>>>>>>>>>>>>> .direction("down")
>>>>>>>>>>>>>>>>>>> .method("copy")
>>>>>>>>>>>>>>>>>>> .row(undefined)
>>>>>>>>>>>>>>>>>>> )
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> (2)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> /* Delete  rows where split1 is null */
>>>>>>>>>>>>>>>>>>> w.add(dw.filter().column([])
>>>>>>>>>>>>>>>>>>> .table(0)
>>>>>>>>>>>>>>>>>>> .status("active")
>>>>>>>>>>>>>>>>>>> .drop(false)
>>>>>>>>>>>>>>>>>>> .row(dw.row().column([])
>>>>>>>>>>>>>>>>>>> .table(0)
>>>>>>>>>>>>>>>>>>> .status("active")
>>>>>>>>>>>>>>>>>>> .drop(false)
>>>>>>>>>>>>>>>>>>> .conditions([dw.is_null().column([])
>>>>>>>>>>>>>>>>>>> .table(0)
>>>>>>>>>>>>>>>>>>> .status("active")
>>>>>>>>>>>>>>>>>>> .drop(false)
>>>>>>>>>>>>>>>>>>> .lcol("split1")
>>>>>>>>>>>>>>>>>>> .value(undefined)
>>>>>>>>>>>>>>>>>>> .op_str("is null")
>>>>>>>>>>>>>>>>>>> ])
>>>>>>>>>>>>>>>>>>> )
>>>>>>>>>>>>>>>>>>> )
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have succeeded in parsing operations similar to (1)
>>>>>>>>>>>>>>>>>>> above and am currently working on extending the parser to
>>>>>>>>>>>>>>>>>>> handle operations similar to (2).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The next step would be automating the generation of the
>>>>>>>>>>>>>>>>>>> Spark transformations.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, Jul 15, 2015 at 7:32 PM, Nirmal Fernando <
>>>>>>>>>>>>>>>>>>> nir...@wso2.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Please send an update at least every week.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Wed, Jul 15, 2015 at 5:51 PM, Supun Sethunga <
>>>>>>>>>>>>>>>>>>>> sup...@wso2.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Any update on the progress? Did you manage to
>>>>>>>>>>>>>>>>>>>>> integrate the transformations with the wrangler?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Thu, Jul 2, 2015 at 11:38 AM, Danula Eranjith <
>>>>>>>>>>>>>>>>>>>>> hmdanu...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Here is an update on the current progress of the
>>>>>>>>>>>>>>>>>>>>>> project and the future activities we discussed at the
>>>>>>>>>>>>>>>>>>>>>> recent meeting.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> *Current Progress*
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I have completed the phase of creating Spark
>>>>>>>>>>>>>>>>>>>>>> transformations corresponding to the operations
>>>>>>>>>>>>>>>>>>>>>> available in wrangler.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Operations implemented:
>>>>>>>>>>>>>>>>>>>>>> - Fill
>>>>>>>>>>>>>>>>>>>>>> - Split
>>>>>>>>>>>>>>>>>>>>>> - Drop
>>>>>>>>>>>>>>>>>>>>>> - Delete
>>>>>>>>>>>>>>>>>>>>>> - Extract
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> *Future activities*
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> - Modify the wrangler interface to suit the current
>>>>>>>>>>>>>>>>>>>>>> implementation
>>>>>>>>>>>>>>>>>>>>>> - Automate the process of generating Spark
>>>>>>>>>>>>>>>>>>>>>> transformations
>>>>>>>>>>>>>>>>>>>>>> - Integrate wrangler into the ML workflow
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Sun, Jun 28, 2015 at 9:31 AM, Danula Eranjith <
>>>>>>>>>>>>>>>>>>>>>> hmdanu...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> No, we haven't done a review yet.
>>>>>>>>>>>>>>>>>>>>>>> It would be great if we could have one so that I can
>>>>>>>>>>>>>>>>>>>>>>> discuss with you all and clarify the next steps of the
>>>>>>>>>>>>>>>>>>>>>>> implementation, as you mentioned.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Sun, Jun 28, 2015 at 9:25 AM, Supun Sethunga <
>>>>>>>>>>>>>>>>>>>>>>> sup...@wso2.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Did we have a review for the work done so far? If
>>>>>>>>>>>>>>>>>>>>>>>> not, shall we have one? We can clear up any doubts
>>>>>>>>>>>>>>>>>>>>>>>> and issues as well.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>> Supun
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Jun 24, 2015 at 6:42 AM, Nirmal Fernando <
>>>>>>>>>>>>>>>>>>>>>>>> nir...@wso2.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the update, keep them coming.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On a JavaRDD you can perform a collect() to get a
>>>>>>>>>>>>>>>>>>>>>>>>> list, AFAIR. Yes, this is costly, since it would load
>>>>>>>>>>>>>>>>>>>>>>>>> the whole dataset into memory. So, is this an
>>>>>>>>>>>>>>>>>>>>>>>>> operation which involves multiple rows?
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Jun 23, 2015 at 2:15 PM, Danula Eranjith <
>>>>>>>>>>>>>>>>>>>>>>>>> hmdanu...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Supun,
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I modified the "Fill" operation to add what you
>>>>>>>>>>>>>>>>>>>>>>>>>> mentioned.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I used a workaround to implement certain parts of
>>>>>>>>>>>>>>>>>>>>>>>>>> the operations, such as filling with values from
>>>>>>>>>>>>>>>>>>>>>>>>>> rows above and below. I created a List
>>>>>>>>>>>>>>>>>>>>>>>>>> implementation using the toArray() method in
>>>>>>>>>>>>>>>>>>>>>>>>>> JavaRDD and then converted it back to a JavaRDD
>>>>>>>>>>>>>>>>>>>>>>>>>> after the operation.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> This will be inefficient (in terms of both memory
>>>>>>>>>>>>>>>>>>>>>>>>>> and time) when working with very large datasets.
>>>>>>>>>>>>>>>>>>>>>>>>>> But I think it's important to have these features
>>>>>>>>>>>>>>>>>>>>>>>>>> included. Otherwise, a user would be left with a
>>>>>>>>>>>>>>>>>>>>>>>>>> very limited set of operations.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Please let me know if you have a different
>>>>>>>>>>>>>>>>>>>>>>>>>> opinion on this.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Jun 16, 2015 at 9:44 AM, Supun Sethunga <
>>>>>>>>>>>>>>>>>>>>>>>>>> sup...@wso2.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Somehow there are issues in implementing certain
>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrangler functions due to limitations in JavaRDD 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> used in spark
>>>>>>>>>>>>>>>>>>>>>>>>>>>> e.g. -
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Fill operation - when filling with values from
>>>>>>>>>>>>>>>>>>>>>>>>>>>> rows above and below
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Fold operation
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Agreed. Since rows are processed in no particular
>>>>>>>>>>>>>>>>>>>>>>>>>>> order with Spark, inter-row operations are not very
>>>>>>>>>>>>>>>>>>>>>>>>>>> meaningful.
>>>>>>>>>>>>>>>>>>>>>>>>>>> But you can slightly modify the implementation
>>>>>>>>>>>>>>>>>>>>>>>>>>> of the "Fill" operation, for example to fill values
>>>>>>>>>>>>>>>>>>>>>>>>>>> based on an expression, a static value, the mean,
>>>>>>>>>>>>>>>>>>>>>>>>>>> etc. (not depending on other rows).
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>> Supun
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Jun 16, 2015 at 9:27 AM, Supun Sethunga
>>>>>>>>>>>>>>>>>>>>>>>>>>> <sup...@wso2.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Sorry for the late reply. Have you got the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> details you were looking for?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> It would be great if I could get to know which
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrangler operations are important for a user of 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the ML
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Other than the ones you have mentioned in the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> proposal, I think it's better to have a "Translate"
>>>>>>>>>>>>>>>>>>>>>>>>>>>> operation as well (to create a new column based on
>>>>>>>>>>>>>>>>>>>>>>>>>>>> an existing column).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Supun
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Jun 4, 2015 at 10:11 PM, Danula
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Eranjith <hmdanu...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I am currently working on generating Spark
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> transformations for the operations available in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the data wrangler.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Data wrangler provides sufficient parameters
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to re-create these in Spark. I have successfully
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> implemented the delete and split operations of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrangler in Spark.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Once this phase is completed, I can either
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> generate these scripts directly in wrangler or
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> take the JavaScript output and convert it to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Spark, depending on the implementation.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> However, there are issues in implementing
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> certain wrangler functions due to limitations of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the JavaRDD used in Spark,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> e.g. -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Fill operation - when filling with values from
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> rows above and below
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Fold operation
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> It would be great if I could get to know which
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrangler operations are important for a user of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ML.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Jun 3, 2015 at 8:30 AM, Nirmal
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Fernando <nir...@wso2.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Please send an update of your work thus far.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Sun, May 10, 2015 at 2:30 PM, Nirmal
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Fernando <nir...@wso2.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Welcome to GSoC '15! Can you do some
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> research on directly generating Spark
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> transformations using Wrangler and come up with
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> a summary?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, May 8, 2015 at 11:03 AM, Danula
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Eranjith <hmdanu...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you for selecting my proposal [1]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for GSoC 2015. I am really looking forward to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> working with you all and contributing to WSO2.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I have already completed my primary
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> research on wrangler and would like to meet
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> you to get feedback on the proposed
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> architecture. I am planning to start working on
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the project before the 25th of May.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1] -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://docs.google.com/document/d/18NFa23CrhXqnHrkl_AuRz3sQ3Axg7SEmiA7l66Hl9_0/edit?usp=sharing
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks & regards,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Nirmal
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Associate Technical Lead - Data Technologies
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Team, WSO2 Inc.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Mobile: +94715779733
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>>>>>> *Supun Sethunga*
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Software Engineer
>>>>>>>>>>>>>>>>>>>>>>>>>>>> WSO2, Inc.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> http://wso2.com/
>>>>>>>>>>>>>>>>>>>>>>>>>>>> lean | enterprise | middleware
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Mobile : +94 716546324
>>>>> --
>>>>>
>>>>> Thanks & regards,
>>>>> Nirmal
>>>>>
>>>>> Team Lead - WSO2 Machine Learner
>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>> Mobile: +94715779733
>>>>> Blog: http://nirmalfdo.blogspot.com/
_______________________________________________
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev
