I have added some more tickets. Feel free to check them out.

On Wed, Nov 11, 2020 at 7:46 PM Vinoth Chandar <[email protected]> wrote:

> For some reason, that link did not work.
>
> https://issues.apache.org/jira/issues/?jql=labels%20%3D%20gsoc2021
>
> This list looks promising to me. We can wait a bit for others' feedback and
> go ahead and finalize?
>
> On Tue, Nov 10, 2020 at 1:01 PM Raymond Xu <[email protected]> wrote:
>
> > Ok updated the list (more descriptions to be added later for those tasks,
> > once we finalize it)
> >
> > https://issues.apache.org/jira/browse/HUDI-1290?jql=labels%20%3D%20gsoc2021%20and%20project%20%3D%20HUDI
> >
> > On Tue, Nov 10, 2020 at 11:33 AM Vinoth Chandar <[email protected]> wrote:
> >
> > > Sounds good. I can make a pass as well, once you have the trimmed list.
> > >
> > > Thanks Raymond!
> > >
> > > On Mon, Nov 9, 2020 at 7:47 PM Raymond Xu <[email protected]> wrote:
> > >
> > > > yes, agreed to remove the refactoring tasks and reduce to a small number of
> > > > umbrella tasks and polish them for the program.
> > > >
> > > > On Mon, Nov 9, 2020 at 7:39 PM Vinoth Chandar <[email protected]> wrote:
> > > >
> > > > > Good discussion!
> > > > >
> > > > > Given this is going to be almost like a summer internship, I suggest we
> > > > > limit ourselves to high quality, independent projects.
> > > > > We have tons of ideas, but the need to ensure that we won't be picking them
> > > > > up ourselves, is what makes this tricky.
> > > > >
> > > > > Here are some ideas top of my head. (maybe we can use this thread to
> > > > > collect ideas first)
> > > > > Most of these are experimental.
> > > > >
> > > > > - Schema inference library, that infers a schema from vast quantities of
> > > > >   unstructured data and helps us bootstrap that into Hudi
> > > > > - Survey indexing techniques, and implement a subset that can speed up
> > > > >   query performance (e.g. bitmaps, tree indexes)
> > > > > - Apache Beam integration (there is a JIRA for this) with a Hudi IO module.
> > > > > - Apache Calcite implementation for querying Hudi datasets (we can pick any
> > > > >   other popular engine also here)
> > > > > - Apache Pulsar/Kinesis source in Delta Streamer
> > > > >
> > > > > Raymond, the current labels have a bunch of refactoring/tasks also tagged.
> > > > > If you also agree, can we untag and only put up say 5 or so, bigger
> > > > > projects?
> > > > >
> > > > > Things around refactoring etc., for example, would probably get done before
> > > > > summer.
> > > > >
> > > > > Thanks
> > > > > Vinoth
> > > > >
> > > > > On Mon, Nov 9, 2020 at 4:58 PM Raymond Xu <[email protected]> wrote:
> > > > >
> > > > > > Hi Siva and all,
> > > > > >
> > > > > > On students' commitments
> > > > > >
> > > > > > > During the 12 weeks of coding time, nothing should take precedence over
> > > > > > > your project, and you should have no major distractions.
> > > > > >
> > > > > > Quote from this page
> > > > > > <https://opensource.googleblog.com/2011/03/dos-and-donts-of-google-summer-of-code.html>,
> > > > > > the participating students will be working on the project as a full-time
> > > > > > job for 12 weeks. So roughly we could expect 40 hours of work per week.
> > > > > >
> > > > > > Also agree on your points of making the experience meaningful.
> > > > > > I suppose at this stage we are still collecting all sorts of potential
> > > > > > tasks, which can be filtered later.
> > > > > >
> > > > > > I have started looking for issues in the backlog and labeled some. Please
> > > > > > check out this JIRA filter
> > > > > >
> > > > > > https://issues.apache.org/jira/browse/HUDI-1290?jql=labels%20%3D%20gsoc2021%20and%20project%20%3D%20HUDI
> > > > > >
> > > > > > Note: those issues' descriptions are to be edited with more introduction to
> > > > > > be more newcomer-friendly. And there can be some new tasks created, too.
> > > > > >
> > > > > > Any feedback on those labelled tasks? Also, does anyone want to bring in more
> > > > > > ideas or tasks for this program?
> > > > > > Please feel free to post JIRA issue links here so we can consolidate all
> > > > > > and groom later.
> > > > > >
> > > > > > Thank you.
> > > > > >
> > > > > > Regards,
> > > > > > Raymond
> > > > > >
> > > > > > On Fri, Nov 6, 2020 at 10:16 AM Sivabalan <[email protected]> wrote:
> > > > > >
> > > > > > > Sorry, just another point to remember is that this might happen by June,
> > > > > > > July, Aug of 2021. So the proposal assumes that the community may not work
> > > > > > > on these until then.
> > > > > > >
> > > > > > > On Fri, Nov 6, 2020 at 1:03 PM Sivabalan <[email protected]> wrote:
> > > > > > >
> > > > > > > > I am also interested and still trying to read more on what kind of
> > > > > > > > projects we can propose (execution, design, documentation, usability/tools,
> > > > > > > > performance framework etc), how much effort we can expect from
> > > > > > > > students (is it 10 hours per week or 20 hours per week, etc).
> > > > > > > > One thing we should be mindful of is that we should try our best to think
> > > > > > > > how best we can help students and ensure they get something meaningful out
> > > > > > > > of working with us and get a good sense of how open source projects work,
> > > > > > > > code quality we expect etc. And not give some assorted 10 different tasks
> > > > > > > > for them to complete. We should try to have standalone projects or cohesive
> > > > > > > > work items (like devX maybe).
> > > > > > > >
> > > > > > > > On Fri, Nov 6, 2020 at 12:50 PM Raymond Xu <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > I'm interested in being a mentor and would like to create and submit some
> > > > > > > > > issues. (Sorry wanted to raise this earlier)
> > > > > > > > >
> > > > > > > > > To recap it for all:
> > > > > > > > > There will be college students applying and, once accepted, working on some
> > > > > > > > > JIRA issues of Apache projects in summer 2021. We are to create issues and
> > > > > > > > > label them for this program. Those tasks will be assigned to participants
> > > > > > > > > and worked on around June 2021.
> > > > > > > > >
> > > > > > > > > To list some of the possible areas at a high level:
> > > > > > > > > - DevX related: code style fix and alignment, nightly build setup, config
> > > > > > > > >   docs auto-generation
> > > > > > > > > - New features: new indexing schemes, SQL querying of metadata
> > > > > > > > > - Utilities improvements: new delta streamer sources, a UI, integrations
> > > > > > > > >   with other systems, e.g. Airflow operator/sensor to trigger pipelines based
> > > > > > > > >   on Hudi commits
> > > > > > > > >
> > > > > > > > > On Fri, Nov 6, 2020 at 9:32 AM Vinoth Chandar <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Hi all,
> > > > > > > > > >
> > > > > > > > > > Anyone interested in putting up some projects?
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > > > Vinoth
> > > > > > > > > >
> > > > > > > > > > ---------- Forwarded message ---------
> > > > > > > > > > From: Sally Khudairi <[email protected]>
> > > > > > > > > > Date: Mon, Nov 2, 2020 at 7:52 PM
> > > > > > > > > > Subject: [PMCs] Ramping up for Google Summer of Code 2021: invitation to
> > > > > > > > > > participate
> > > > > > > > > > To: ASF Marketing & Publicity <[email protected]>
> > > > > > > > > >
> > > > > > > > > > Hello PMCs -- I hope you are all well.
> > > > > > > > > >
> > > > > > > > > > ASF Community Development (ComDev) oversees our participation in Google
> > > > > > > > > > Summer of Code, for which the ASF has been a mentoring organization since
> > > > > > > > > > the program's inception 17 years ago.
> > > > > > > > > >
> > > > > > > > > > ComDev is seeking individuals and PMCs interested in participating as
> > > > > > > > > > mentors on behalf of the ASF and Apache Projects.
> > > > > > > > > >
> > > > > > > > > > The planning and preparation process begins now. The ComDev team are
> > > > > > > > > > collecting ideas for the Apache Project's participation in GSoC and want to
> > > > > > > > > > hear from you.
> > > > > > > > > >
> > > > > > > > > > Get started by reviewing the program guidelines at
> > > > > > > > > > http://community.apache.org/gsoc.html and be sure to engage your
> > > > > > > > > > communities to get involved as well. Ping the ASF's GSoC team at
> > > > > > > > > > [email protected] with any questions.
> > > > > > > > > >
> > > > > > > > > > Good luck and have a great program!
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Sally
> > > > > > > > > >
> > > > > > > > > > - - -
> > > > > > > > > > Vice President Marketing & Publicity
> > > > > > > > > > Vice President Sponsor Relations
> > > > > > > > > > The Apache Software Foundation
> > > > > > > > > >
> > > > > > > > > > Tel +1 617 921 8656 | [email protected]
> > > > > > > >
> > > > > > > > --
> > > > > > > > Regards,
> > > > > > > > -Sivabalan
> > > > > > >
> > > > > > > --
> > > > > > > Regards,
> > > > > > > -Sivabalan

--
Regards,
-Sivabalan
