This is the latest revision of our Quarterly Report for February, due Feb
4th.
Lee.
On Sun, Jan 19, 2020 at 10:02 PM leerho <[email protected]> wrote:
> Evans,
>
> This is good feedback and we are working on it.
> As for this last RC it was my mistake to not include the bug references.
> I have added an addendum to the vote thread with that information. I will
> also annotate the tags on github when I get a chance.
>
> Lee.
>
> On Sat, Jan 18, 2020 at 12:05 AM Evans Ye <[email protected]> wrote:
>
>>
>> | We need to have more substantive discussions on our dev@ list
>> especially about our growing TODO list and how we plan to address them.
>>
>> Is this the same idea as disclosing the roadmap that the community is
>> working towards, as well as being more transparent about what is going
>> on? If so, I'm a super +1.
>>
>> I think an important thing to do is to:
>>
>> Adopt the Apache Way and build up an open community culture from there.
>>
>> One thing we can do better is to first make code changes more
>> transparent to the public. For example, regarding the cancellation of
>> datasketches java RC1, I can't find the details of which bug caused it and
>> what was fixed. I can only look into the PR, but I still don't know the
>> story, which you seem to know from what's going on internally.
>>
>> Concrete actions we can take:
>> 1. Refer to an issue, whether it's a GitHub issue or a JIRA, so that
>> code changes are disclosed.
>> 2. For releases, let people know what has been fixed/added. A typical way
>> to do it is via release notes[1]. An HTML or text version can be generated
>> by JIRA with the click of a button if we have issues properly tracked in it.
>>
>> Let me know if I just failed to locate what you've already done :)
>>
>> [1] https://bigtop.apache.org/release-notes.html
>>
>>
>> On Fri, Jan 17, 2020 at 10:11 AM leerho <[email protected]> wrote:
>>
>>> Thank you!
>>>
>>> Lee.
>>>
>>> On Thu, Jan 16, 2020 at 6:00 PM Dave Fisher <[email protected]>
>>> wrote:
>>>
>>>> Hi -
>>>>
>>>> Sent from my iPhone
>>>>
>>>> On Jan 16, 2020, at 5:03 PM, leerho <[email protected]> wrote:
>>>>
>>>>
>>>> I was reserving private@ for personnel type issues (e.g., new
>>>> committers, etc.). It didn't occur to me to separate dev from private by
>>>> technical work vs PMC work, since the quarterly reports are also public.
>>>> So you are saying even the QR discussion should be on private because it
>>>> is primarily for the PMC?
>>>>
>>>>
>>>> Where the public parts of the report are discussed is up to the PMC. I
>>>> see the following three cases since I’ve been on the Board through a full
>>>> quarter:
>>>>
>>>> (1) Public discussion.
>>>>
>>>> (2) Private discussion. Required if there is an open security,
>>>> personnel/personal, or trademark issue. If a podling has one of these then
>>>> we’ll need to work with the IPMC Chair since the podling report process is
>>>> public and they submit a report to the board through private channels.
>>>>
>>>> (3) Chair just submits. Podlings have no Chair and Mentors must approve.
>>>>
>>>> Since not enough is happening I suggest (1) in order to work towards
>>>> using dev@. We have a few weeks. Absolutely the best reports are those
>>>> that come from the whole community. The Board loves reports that frankly
>>>> discuss what is happening in the community: events, releases, new
>>>> contributors, development velocity or stagnation, etc. Challenges and what
>>>> the (P)PMC members are doing to address them.
>>>>
>>>> Any of the above methods will be visible to any of the 700+ Apache
>>>> Software Foundation Members, including future members. These are archived
>>>> and it’s not recorded who accesses the archives. Public mailing lists are
>>>> archived by others outside of the ASF. Someone might look in 15 years in
>>>> any of the mailing lists. I know I have looked at some project’s archives.
>>>>
>>>> Best Regards,
>>>> Dave
>>>>
>>>>
>>>> On Thu, Jan 16, 2020 at 2:24 PM Kenneth Knowles <[email protected]>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Thu, Jan 16, 2020 at 9:38 AM leerho <[email protected]> wrote:
>>>>>
>>>>>> Attached are modifications to the Quarterly Report (thank you
>>>>>> Dave!).
>>>>>>
>>>>>> Any more thoughts? Anyone?
>>>>>>
>>>>>
>>>>> Dave said it all. Use dev@ for dev (clearly dev work is happening)
>>>>> and private@ for PMC business (which presumes the PMC has business to
>>>>> take care of - new committers / PPMC members to consider).
>>>>>
>>>>> Kenn
>>>>>
>>>>>
>>>>>
>>>>>> Answer to Dave's question: Have any candidates for committer or PPMC
>>>>>>> membership been seen anywhere?
>>>>>>
>>>>>>
>>>>>> No, not yet.
>>>>>>
>>>>>> Lee.
>>>>>>
>>>>>> On Thu, Jan 16, 2020 at 8:21 AM Dave Fisher <[email protected]> wrote:
>>>>>>
>>>>>>> Hi -
>>>>>>>
>>>>>>> Items #1 and #2 on what to improve are really the same. A proper #2
>>>>>>> should be to have more substantive technical discussions on the
>>>>>>> dev@datasketches mailing list rather than a Google group. The
>>>>>>> mailing list should be the most prominent way to connect with the
>>>>>>> project
>>>>>>> and not the least (according to the Group page)
>>>>>>>
>>>>>>> Please include information in the report about activity levels on
>>>>>>> slack and google groups.
>>>>>>>
>>>>>>> Have any candidates for committer or PPMC membership been seen
>>>>>>> anywhere? If so then please start a discussion on private@. If not
>>>>>>> then a discussion on dev@ should proceed about what the project is
>>>>>>> looking for and how someone could earn merit.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Dave
>>>>>>>
>>>>>>> > On Jan 15, 2020, at 6:45 PM, leerho <[email protected]> wrote:
>>>>>>> >
>>>>>>> > Folks,
>>>>>>> > This is my first cut at a draft report that needs to be completed
>>>>>>> by Feb 5th End-of-day.
>>>>>>> >
>>>>>>> > Please, please -- feel free to edit, revise or add comments.
>>>>>>> >
>>>>>>> > Whimsy and other tools expect the report to be formatted in a certain
>>>>>>> way so please make sure that you try to:
>>>>>>> > • Keep all lines under 76 characters long.
>>>>>>> > • All content under the ### headings should be indented by
>>>>>>> two spaces. Do not use tabs.
>>>>>>> > • Please don't change the text in the headings or add new
>>>>>>> ones.
>>>>>>> > • Include a space after a bullet point or full stop on a
>>>>>>> numbered list.
>>>>>>> > • Use [X] (X and no spaces) to sign off reports.
>>>>>>> >
>>>>>>> > It might be easier if you give me the section and text you want to
>>>>>>> edit or add to and I will merge it into a master markdown.
>>>>>>> >
>>>>>>> > Thanks!
>>>>>>> >
>>>>>>> > Lee.
>>>>>>> >
>>>>>>> > <QR_2020-02-05.md>
>>>>>>> >
>>>>>>> ---------------------------------------------------------------------
>>>>>>> > To unsubscribe, e-mail: [email protected]
>>>>>>> > For additional commands, e-mail: [email protected]
>>>>>>>
>>>>>>>
>>>>>
>>>>> --
> From my cell phone.
>
## DataSketches 19 Feb 2020
DataSketches is an open source, high-performance library of stochastic
streaming algorithms commonly called "sketches" in the data sciences.
Sketches are small, stateful programs that process massive data as a stream
and can provide approximate answers, with mathematical guarantees, to
computationally difficult queries orders-of-magnitude faster than
traditional, exact methods.
DataSketches has been incubating since 2019-03-30.
### Three most important unfinished issues to address before graduating:
1. Be more communicative and document our code changes more clearly.
2. We need to have more substantive discussions on dev@, especially about our growing
TODO list and how we plan to address it -- create a roadmap as a guide for
others to contribute.
3. Find / Attract new code committers outside Yahoo!
### Are there any issues that the IPMC or ASF Board need to be aware of?
No
### How has the community developed since the last report?
We are presenting at more conferences, which has attracted some interest.
We are definitely getting more traffic on our forum, GitHub issues
and email lists. We recently added two channels on the-asf@slack: #datasketches
and #datasketches-dev. The traffic has been fairly low on Slack as well as
the forum. We could do more to publicize the Slack channels. I could be
optimistic and believe the low traffic is due to the holidays -- or that the
code just works :)
Nonetheless, the download traffic measured by repository.a.o
has grown exponentially since our first Apache release on Sep 23. We are over 1000
unique IPs/month and had a recent high of 22K downloads/month. Bear in mind
that this is all traffic that has migrated from the older, pre-Apache artifacts
at com.yahoo.datasketches and is already higher than our peak downloads prior to
Apache. These numbers also do not reflect any downloads of our Zip artifacts
from a.o./dist (which includes our C++ artifacts) or other external download
repositories (for example, specific to PostgreSQL).
### How has the project developed since the last report?
Our releases are becoming easier, more polished and routine.
Nonetheless, our website needs a lot of work (as mentioned above) and this will
become our focus for the next month or so.
### How would you assess the podling's maturity?
Please feel free to add your own commentary.
- [ ] Initial setup
- [ ] Working towards first release
- [X] Community building
- [ ] Nearing graduation
- [ ] Other:
### Date of last release:
These are the major components and their last release dates:
* DataSketches-Java 2020-01-26
* DataSketches-Memory 2019-11-21
* DataSketches-CPP 2019-09-17
* DataSketches-Hive 2019-10-11
* DataSketches-Pig 2019-10-18
* DataSketches-Postgresql 2019-10-29
### When were the last committers or PPMC members elected?
No new committers since April, 2019.
### Have your mentors been helpful and responsive?
Yes.
Are things falling through the cracks? If so, please list any
open issues that need to be addressed.
No open issues other than our website troubles :(
### Is the PPMC managing the podling's brand / trademarks?
To the best of our knowledge, yes.
* Are 3rd parties respecting and correctly using the podlings name and brand?
As far as we know, yes.
* If not what actions has the PPMC taken to correct this?
We have not had to face this issue yet.
* Has the VP, Brand approved the project name?
Yes, and it is clearly stated as such on
http://incubator.apache.org/projects/datasketches.html
### Signed-off-by:
- [ ] (datasketches) Liang Chen
Comments:
- [ ] (datasketches) Kenneth Knowles
Comments:
- [ ] (datasketches) Furkan Kamaci
Comments:
- [ ] (datasketches) Dave Fisher
Comments:
- [ ] (datasketches) Evans Ye
Comments: