Hi,

Attendees:

Dmitriy Ryaboy
Alan Gates
Ashutosh Chauhan
Daniel Dai
Xuefu Zhang
Richard Ding
Olga Natkovich

Topics discussed:


(1)    Improving Pig testing:

a.       Short term

        i.      Make tests run significantly faster. Dmitriy said he would work
on transitioning the tests to run in local mode. Hopefully that will reduce the
run time from 10 hours to about 3.

        ii.     Get test patch automation back on. I took an action item to
follow up on this.

b.      Longer term

        i.      Move beyond unit testing. Alan suggested that once the recently
open-sourced e2e harness is ready to be used (3-6 months), we move most of the
e2e tests we currently run as unit tests into the e2e harness and leave only
true unit tests in JUnit. This would reduce unit test runtime to under an hour
and would allow us to run the e2e tests on real data and real clusters, making
the testing more realistic.

        ii.     Figure out a way to make UDF testing easier. I don't think we
had many good ideas on how to do this. Needs further discussion.

(2)    Discussion on release management. The main goal is to maintain stability
for production systems while allowing changes to be released quickly. We came
up with the following proposal:

a.       Make major releases time-based (not feature-based) and release every 3
months

b.      Keep post-release branches stable by only allowing P1 changes (failures
with no reasonable workaround or silent failures)

c.       Develop disruptive features (for example, parser changes) on separate
branches and only fold them in once the code is complete and stabilized.

(3)    Discussion on revamping UDF interface

a.       Make the interface simpler - no need to implement 3 different versions

b.      Make it more intuitive

        i.      No need to wrap input parameters into tuples

        ii.     No need for parameter casting

        iii.    Simplify schema management

        iv.     Simplify overloading

c.       This will need to coexist with the current approach for a significant
amount of time (6-12 months) to let users transition.

(4)    Status of Piggybank

a.       Not much progress so far. Dmitriy is struggling with the build process.


Other attendees, please feel free to add anything I missed.

Olga



