Re: [EXTERNAL] Re: Proposal to Revive Apache Livy Community

2022-10-25 Thread larry mccay
Hi Renat -

More active contributors/committers would be great!
Let's get the names before I craft the NOTICE email to add new members
later today.

If that isn't realistic then we can always add more after the fact as well.
That's what is great about re/building the community in the incubator!

thanks!

--larry

On Mon, Oct 24, 2022 at 10:30 PM Renat Bekbolatov
 wrote:

> I welcome this development - honestly, possibly long overdue.
> Would it be possible to propose additional committers for a wider
> representation?
>
>
> Thank you,
> Renat
>
> -Original Message-
> From: larry mccay 
> Sent: Monday, October 24, 2022 4:41 PM
> To: d...@livy.apache.org
> Subject: [EXTERNAL] Re: Proposal to Revive Apache Livy Community
>
> Again, I have no objections to this and will be happy to include you.
> Thanks for calling out!
>
> On Mon, Oct 24, 2022 at 5:38 PM Marco Gaido  wrote:
>
> > Hi all,
> >
> > I am also busy with different projects at the moment; as such, I am
> > more than happy if I can help somehow, e.g. reviewing code in parts
> > that I know, but I cannot allocate much bandwidth to it.
> >
> > Thanks,
> > Marco
> >
> > On Mon, Oct 24, 2022, 22:07 larry mccay wrote:
> >
> > > Personally, I would welcome it.
> > > Any notes of guidance and support as we ramp up the team and make
> > > the first few releases would be invaluable.
> > >
> > >
> > > On Mon, Oct 24, 2022 at 3:49 PM Alex Bozarth wrote:
> > >
> > > > I would love to stay on as a PPMC, but if my lack of availability
> > > > concerns the new PPMC then I'm willing to be emeritus.
> > > >
> > > >
> > > > Alex Bozarth
> > > > Jupyter Architect, IBM CODAIT
> > > > GitHub: ajbozarth
> > > >
> > > > On 10/24/22, 2:46 PM, "larry mccay"  wrote:
> > > >
> > > > Oh, very good, Alex!
> > > > Be sure to let us know if you want to remain on the project in
> > > > any capacity.
> > > >
> > > >
> > > > On Mon, Oct 24, 2022 at 3:18 PM Alex Bozarth
> > > > 
> > > > wrote:
> > > >
> > > > > As the only Livy PPMC still responding to the Mentors on the
> > > private
> > > > list,
> > > > > I have updated the Livy Podling report for November with the
> > status
> > > > of the
> > > > > project and a request to the IPMC to review this proposal
> > > > since there are
> > > > > not enough active Livy PPMC members to reach a quorum to
> > > > pass the proposal.
> > > > >
> > > > > As a current Livy PPMC I strongly support this proposal for
> > > > revitalization
> > > > > as I do not have enough bandwidth to dedicate to Livy.
> > > > >
> > > > >
> > > > > Alex Bozarth
> > > > > Jupyter Architect, IBM CODAIT
> > > > > GitHub: ajbozarth
> > > > >
> > > > > On 10/24/22, 1:41 PM, "larry mccay"  wrote:
> > > > >
> > > > > Gentle reminder that we need to determine next steps here.
> > > > > We have an updated proposal on this thread.
> > > > > Do we need a VOTE or can we move forward directly to
> > adjusting
> > > > the
> > > > > members,
> > > > > etc?
> > > > >
> > > > > Thanks!
> > > > >
> > > > > --larry
> > > > >
> > > > > On Thu, Oct 20, 2022 at 3:26 PM larry mccay <
> > lmc...@apache.org
> > > >
> > > > wrote:
> > > > >
> > > > > > @Justin Mclean  - any insights on
> > next
> > > > steps
> > > > > here?
> > > > > >
> > > > > >
> > > > > > On Tue, Oct 18, 2022 at 5:44 PM larry mccay <
> > > lmc...@apache.org
> > > > >
> > > > > wrote:
> > > > > >
> > > > > >> Very good, here is the latest revision with updated
> > Mentors.
> > > > > >> Sunil and I have been added to the IPMC as well.
> > > > > >> Welcome Madhawa a

Re: Proposal to Revive Apache Livy Community

2022-10-24 Thread larry mccay
Great, I'll move that forward.
That sounds like a NOTICE thread?


On Mon, Oct 24, 2022 at 4:16 PM Justin Mclean 
wrote:

> Hi,
>
> Usually, the PPMC would need to vote on adding new PPMC members, but it
> seems Livy no longer has 3 active PPMC members. The chair of the project
> can add PMC members. But PPMC members only have standing in the Incubator,
> so I would guess (as it isn’t documented) that any IPMC member could add
> those new PPMC members.
>
> A VOTE is probably not required but is helpful to confirm consensus. I
> would write an email to the general list listing who will be added to
> the Livy PPMC. If there are no objections by any IPMC members after 72
> hours, add them.
>
> Kind Regards,
> Justin
> -
> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> For additional commands, e-mail: general-h...@incubator.apache.org
>
>


Re: Proposal to Revive Apache Livy Community

2022-10-24 Thread Justin Mclean
Hi,

Usually, the PPMC would need to vote on adding new PPMC members, but it seems
Livy no longer has 3 active PPMC members. The chair of the project can add PMC
members. But PPMC members only have standing in the Incubator, so I would guess
(as it isn’t documented) that any IPMC member could add those new PPMC members.

A VOTE is probably not required but is helpful to confirm consensus. I would
write an email to the general list listing who will be added to the Livy
PPMC. If there are no objections by any IPMC members after 72 hours, add them.

Kind Regards,
Justin



Re: Proposal to Revive Apache Livy Community

2022-10-24 Thread larry mccay
Oh, very good, Alex!
Be sure to let us know if you want to remain on the project in any capacity.


On Mon, Oct 24, 2022 at 3:18 PM Alex Bozarth  wrote:

> As the only Livy PPMC still responding to the Mentors on the private list,
> I have updated the Livy Podling report for November with the status of the
> project and a request to the IPMC to review this proposal since there are
> not enough active Livy PPMC members to reach a quorum to pass the proposal.
>
> As a current Livy PPMC I strongly support this proposal for revitalization
> as I do not have enough bandwidth to dedicate to Livy.
>
>
> Alex Bozarth
> Jupyter Architect, IBM CODAIT
> GitHub: ajbozarth

RE: Proposal to Revive Apache Livy Community

2022-10-24 Thread Alex Bozarth
As the only Livy PPMC still responding to the Mentors on the private list, I 
have updated the Livy Podling report for November with the status of the 
project and a request to the IPMC to review this proposal since there are not 
enough active Livy PPMC members to reach a quorum to pass the proposal. 

As a current Livy PPMC I strongly support this proposal for revitalization as I 
do not have enough bandwidth to dedicate to Livy. 

 
Alex Bozarth
Jupyter Architect, IBM CODAIT
GitHub: ajbozarth

On 10/24/22, 1:41 PM, "larry mccay"  wrote:

Gentle reminder that we need to determine next steps here.
We have an updated proposal on this thread.
Do we need a VOTE or can we move forward directly to adjusting the members,
etc?

Thanks!

--larry


Re: Proposal to Revive Apache Livy Community

2022-10-24 Thread larry mccay
Gentle reminder that we need to determine next steps here.
We have an updated proposal on this thread.
Do we need a VOTE or can we move forward directly to adjusting the members,
etc?

Thanks!

--larry

On Thu, Oct 20, 2022 at 3:26 PM larry mccay  wrote:

> @Justin Mclean  - any insights on next steps here?

Re: Proposal to Revive Apache Livy Community

2022-10-20 Thread larry mccay
@Justin Mclean  - any insights on next steps here?



Re: Proposal to Revive Apache Livy Community

2022-10-18 Thread larry mccay
Very good, here is the latest revision with updated Mentors.
Sunil and I have been added to the IPMC as well.
Welcome Madhawa and thanks for stepping up as a Mentor for Livy!

Abstract

Livy is a web service that exposes a REST interface for managing long
running Apache Spark contexts in your cluster. With Livy, new applications
can be built on top of Apache Spark that require fine grained interaction
with many Spark contexts [1].

While this project has been well regarded and used in many contexts as the
de facto standard API to Spark environments, it has been incubating for over
5 years without graduation to a TLP, and it has become difficult to
impossible for fixes and improvements to be contributed as the current
community seems to have moved on.

There has been discussion regarding retirement of this podling where there
seems to be some increasing interest in joining and reviving the community
[2].

The intent of this proposal is to avoid retiring a well regarded, actively
used and rather mature project by reviving the PPMC and community with new
folks that have a vested interest in the project and health of the
community.

Proposal

We propose to revive the PPMC with a set of contributors and maintainers as
mentors, PPMC members and committers.

The retirement DISCUSS thread [2] has shown a growing interest in providing
new committers and bringing improvements and fixes from organizations’
internally maintained forks back to a revived community.

General Approach to Revival:

   - Add new Mentors
      - Larry McCay, lmc...@apache.org, Cloudera
      - Sunil Govindan, sun...@apache.org, Cloudera
      - Jean-Baptiste Onofré, jbono...@apache.org, Talend
      - Madhawa Gunasekara, madhaw...@gmail.com, Independent

   - Add new Committers/PPMC
      - Larry McCay, lmc...@apache.org, Cloudera
      - Vinod Kumar Vavilapalli, vino...@cloudera.com, Cloudera
      - Imran Rashid, iras...@apache.org, Cloudera
      - Gyorgy Gal, ggal, gal.gyo...@gmail.com, Cloudera
      - Wing Yew Poon, wyp...@cloudera.com, Cloudera
      - Xilang Yan, xilang@gmail.com, Shopee
      - Jianzhen Wu, myjianz...@gmail.com, Shopee
      - Nagella Jagadeewara Rao, jnage...@visa.com, Visa
      - Pralab Kumar, pralk...@visa.com, Visa
      - Prasad Shrikant, shrikant@gmail.com, Visa
      - Brahma Reddy Battula, bra...@apache.org, Visa

   - Invite existing PPMC members to opt-in or otherwise go emeritus
      - Jean-Baptiste Onofré, jbono...@apache.org, Talend (opted-in via
        Retirement DISCUSS thread [2])

   - Invite existing Committers to opt-out or otherwise continue

   - Establish Roadmap via follow up DISCUSS thread
      - Known Improvements from Forks which will need proposals and
        discussion:
         - Adding HA for Livy
         - Updating security capabilities (eg. kerberos for jdbc, fixing bugs
           in encryption)
         - Expanding the support for kubernetes
         - Responding to CVEs in dependencies (eg. log4j, thrift)
         - Livy rest cluster - IS THIS SAME AS HA for Livy ABOVE?
         - Support multi Spark versions
         - Implemented a metrics system for Livy
         - Support customize batch/interactive session lifecycle event
           handler, default log event with log4j, very helpful for
           troubleshooting
         - Optimize log to track which session id the log message came from,
           also very helpful for troubleshooting
         - Support customize Spark config optimization rules, can be used to
           optimize config for users’ job
         - A set of command line tool which can be used to replace Spark’s
           spark-submit, pyspark, spark-sql but actually submit application
           in Livy
         - We are planning to implement a JDBC state store, and allow multi
           Livy Thrift sessions to share one backend Spark application in the
           next few months.
      - These items and others that are brought to community may need
        consolidation or multiple configurable options and will need to be
        part of the discussion
         - One-pager Livy Improvement Proposals (LIP) may make sense to drive
           these discussions and convergence
         - Feature Branch Strategy for large changes
            - Large features are hard to review we will need to define a
              strategy
      - Determine the Improvements to be delivered across first 3 Releases
        with Target Release Dates

   - Ensure CVE and Dependency management hygiene is in place


The above approach will usher the community back to an active status with a
Roadmap of 3 or more release plans and security hygiene in place.

Development Practices

The Livy project follows a review 

Re: Proposal to Revive Apache Livy Community

2022-10-18 Thread Madhawa Gunasekara
Hi Larry,

I'm an IPMC Member. madhawa30 at gmail dot com is my preferred email
address.
apache id: madhawa

Thanks,
Madhawa


On Tue, Oct 18, 2022 at 10:05 PM larry mccay  wrote:

> Hi Madhawa -
>
> That's awesome!
> Are you already a member of IPMC?
> If not, are you an ASF member?
> If you are an ASF member you can request that you be added as an IPMC
> member.
>
> Can you provide your company affiliation for the proposal and preferred
> email?
>
> thanks!
>
> --larry


Re: Proposal to Revive Apache Livy Community

2022-10-18 Thread Justin Mclean
Hi,

Madhawa is an IPMC member.

Kind Regards,
Justin




Re: Proposal to Revive Apache Livy Community

2022-10-18 Thread larry mccay
Hi Madhawa -

That's awesome!
Are you already a member of IPMC?
If not, are you an ASF member?
If you are an ASF member you can request that you be added as an IPMC
member.

Can you provide your company affiliation for the proposal and preferred
email?

thanks!

--larry

On Tue, Oct 18, 2022 at 2:18 PM Madhawa Gunasekara 
wrote:

> Hi Larry,
>
> I'm interested in working with Livy and would like to join as a Mentor.
>
> Thanks,
> Madhawa
>
>
> On Tue, Oct 18, 2022 at 6:57 PM larry mccay  wrote:
>
> > Sorry, I missed commenting on this:
> >
> > "There is also no concept as an emeritus PPMC member at the ASF."
> >
> > I assume that we can remove PPMC members that do not opt-in explicitly at
> > this point.
> > They will have every opportunity to rejoin.
> >
> > On Tue, Oct 18, 2022 at 12:48 PM larry mccay  wrote:
> >
> > > I will ask in a separate thread, @Justin Mclean - thanks.
> > > Adding JB adds another company and we are certainly open to anyone else
> > > that would like to join as a mentor.
> > > At the end of the day, the mentors are for instilling the Apache Way
> > > knowledge and steering toward graduation.
> > > I feel that this diversity, while nice to have, is less important than
> > > that of the PPMC and committers for the long term health of the
> > > community.
> > >
> > > We need to push this podling to graduation as quickly as possible since
> > > it is rather mature and needs to get to the next level.
> > >
> > > Again, any potential Mentors that would like to join are more than
> > > welcome.
> > >
> > > On Tue, Oct 18, 2022 at 12:38 PM Justin Mclean <jus...@classsoftware.com>
> > > wrote:
> > >
> > >> Hi,
> > >>
> > >> I’m sorry, but Imran Rashid can’t be a mentor for the project as they
> > >> are not an IPMC member. Currently, both Sunil and Larry (as they are ASF
> > >> members) need to ask to join the IPMC and NOTICE sent to the ASF board.
> > >> I would also prefer that mentors come from different companies.
> > >>
> > >> There is also no concept as an emeritus PPMC member at the ASF.
> > >>
> > >> Kind Regards,
> > >> Justin
> > >>
> > >>
> > >> -
> > >> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> > >> For additional commands, e-mail: general-h...@incubator.apache.org
> > >>
> > >>
> >
>


Re: Proposal to Revive Apache Livy Community

2022-10-18 Thread Madhawa Gunasekara
Hi Larry,

I'm interested in working with Livy and would like to join as a Mentor.

Thanks,
Madhawa


On Tue, Oct 18, 2022 at 6:57 PM larry mccay  wrote:

> Sorry, I missed commenting on this:
>
> "There is also no concept as an emeritus PPMC member at the ASF."
>
> I assume that we can remove PPMC members that do not opt-in explicitly at
> this point.
> They will have every opportunity to rejoin.
>
> On Tue, Oct 18, 2022 at 12:48 PM larry mccay  wrote:
>
> > I will ask in a separate thread, @Justin Mclean  -
> > thanks.
> > Adding JB adds another company and we are certainly open to anyone else
> > that would like to join as a mentor.
> > At the end of the day, the mentors are for instilling the Apache Way
> > knowledge and steering toward graduation.
> > I feel that this diversity, while nice to have, is less important than
> > that of the PPMC and committers for the long term health of the
> > community.
> >
> > We need to push this podling to graduation as quickly as possible since
> > it is rather mature and needs to get to the next level.
> >
> > Again, any potential Mentors that would like to join are more than
> > welcome.
> >
> > On Tue, Oct 18, 2022 at 12:38 PM Justin Mclean
> > wrote:
> >
> >> Hi,
> >>
> >> I’m sorry, but Imran Rashid can’t be a mentor for the project as they
> >> are not an IPMC member. Currently, both Sunil and Larry (as they are ASF
> >> members) need to ask to join the IPMC and NOTICE sent to the ASF board.
> >> I would also prefer that mentors come from different companies.
> >>
> >> There is also no concept as an emeritus PPMC member at the ASF.
> >>
> >> Kind Regards,
> >> Justin
> >>
> >>
> >> -
> >> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> >> For additional commands, e-mail: general-h...@incubator.apache.org
> >>
> >>
>


Re: Proposal to Revive Apache Livy Community

2022-10-18 Thread larry mccay
Sorry, I missed commenting on this:

"There is also no concept as an emeritus PPMC member at the ASF."

I assume that we can remove PPMC members that do not opt-in explicitly at
this point.
They will have every opportunity to rejoin.

On Tue, Oct 18, 2022 at 12:48 PM larry mccay  wrote:

> I will ask in a separate thread, @Justin Mclean  -
> thanks.
> Adding JB adds another company and we are certainly open to anyone else
> that would like to join as a mentor.
> At the end of the day, the mentors are for instilling the Apache Way
> knowledge and steering toward graduation.
> I feel that this diversity, while nice to have, is less important than
> that of the PPMC and committers for the long term health of the community.
>
> We need to push this podling to graduation as quickly as possible since it
> is rather mature and needs to get to the next level.
>
> Again, any potential Mentors that would like to join are more than welcome.
>
> On Tue, Oct 18, 2022 at 12:38 PM Justin Mclean 
> wrote:
>
>> Hi,
>>
>> I’m sorry, but Imran Rashid can’t be a mentor for the project as they are
>> not an IPMC member. Currently, both Sunil and Larry (as they are ASF
>> members) need to ask to join the IPMC and NOTICE sent to the ASF board. I
>> would also prefer that mentors come from different companies.
>>
>> There is also no concept as an emeritus PPMC member at the ASF.
>>
>> Kind Regards,
>> Justin
>>
>>
>> -
>> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
>> For additional commands, e-mail: general-h...@incubator.apache.org
>>
>>


Re: Proposal to Revive Apache Livy Community

2022-10-18 Thread larry mccay
I will ask in a separate thread, @Justin Mclean  -
thanks.
Adding JB adds another company and we are certainly open to anyone else
that would like to join as a mentor.
At the end of the day, the mentors are for instilling the Apache Way
knowledge and steering toward graduation.
I feel that this diversity, while nice to have, is less important than that
of the PPMC and committers for the long term health of the community.

We need to push this podling to graduation as quickly as possible since it
is rather mature and needs to get to the next level.

Again, any potential Mentors that would like to join are more than welcome.

On Tue, Oct 18, 2022 at 12:38 PM Justin Mclean 
wrote:

> Hi,
>
> I’m sorry, but Imran Rashid can’t be a mentor for the project as they are
> not an IPMC member. Currently, both Sunil and Larry (as they are ASF
> members) need to ask to join the IPMC and NOTICE sent to the ASF board. I
> would also prefer that mentors come from different companies.
>
> There is also no concept as an emeritus PPMC member at the ASF.
>
> Kind Regards,
> Justin
>
>
> -
> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> For additional commands, e-mail: general-h...@incubator.apache.org
>
>


Re: Proposal to Revive Apache Livy Community

2022-10-18 Thread Justin Mclean
Hi,

I’m sorry, but Imran Rashid can’t be a mentor for the project as they are not 
an IPMC member. Currently, both Sunil and Larry (as they are ASF members) need 
to ask to join the IPMC and NOTICE sent to the ASF board. I would also prefer 
that mentors come from different companies.

There is also no concept as an emeritus PPMC member at the ASF.

Kind Regards,
Justin



Re: Proposal to Revive Apache Livy Community

2022-10-18 Thread larry mccay
Revised Proposal with opt-out for committers and JB as a mentor which adds
more diversity to the company based mentors.
These were items suggested on this thread. Thanks again, Justin!

Abstract

Livy is a web service that exposes a REST interface for managing long
running Apache Spark contexts in your cluster. With Livy, new applications
can be built on top of Apache Spark that require fine grained interaction
with many Spark contexts [1].

While this project has been well regarded and used in many contexts as the
de facto standard API to Spark environments, it has been incubating for over
5 years without graduating to a TLP, and it has become difficult, if not
impossible, for fixes and improvements to be contributed as the current
community seems to have moved on.

There has been discussion regarding retirement of this podling where there
seems to be some increasing interest in joining and reviving the community
[2].

The intent of this proposal is to avoid retiring a well regarded, actively
used and rather mature project by reviving the PPMC and community with new
folks that have a vested interest in the project and health of the
community.

Proposal

We propose to revive the PPMC with a set of contributors and maintainers as
mentors, PPMC members and committers.

The retirement DISCUSS thread [2] has shown a growing interest in providing
new committers and bringing improvements and fixes from organizations’
internally maintained forks back to a revived community.

General Approach to Revival:

- Add new Mentors
  - Larry McCay, lmc...@apache.org, Cloudera
  - Sunil Govindan, sun...@apache.org, Cloudera
  - Imran Rashid, iras...@apache.org, Cloudera
  - Jean-Baptiste Onofré, jbono...@apache.org, Talend

- Add new Committers/PPMC
  - Larry McCay, lmc...@apache.org, Cloudera
  - Vinod Kumar Vavilapalli, vino...@cloudera.com, Cloudera
  - Gyorgy Gal, ggal, gal.gyo...@gmail.com, Cloudera
  - Wing Yew Poon, wyp...@cloudera.com, Cloudera
  - Xilang Yan, xilang@gmail.com, Shopee
  - Jianzhen Wu, myjianz...@gmail.com, Shopee
  - Nagella Jagadeewara Rao, jnage...@visa.com, Visa
  - Pralab Kumar, pralk...@visa.com, Visa
  - Prasad Shrikant, shrikant@gmail.com, Visa
  - Brahma Reddy Battula, bra...@apache.org, Visa

- Invite existing PPMC members to opt-in or otherwise go emeritus
  - Jean-Baptiste Onofré, jbono...@apache.org, Talend (opted-in via
    Retirement DISCUSS thread [2])

- Invite existing Committers to opt-out or otherwise continue

- Establish Roadmap via follow up DISCUSS thread
  - Known Improvements from Forks which will need proposals and
    discussion:
    - Adding HA for Livy
    - Updating security capabilities (e.g. kerberos for jdbc, fixing bugs
      in encryption)
    - Expanding the support for kubernetes
    - Responding to CVEs in dependencies (e.g. log4j, thrift)
    - Livy rest cluster - IS THIS SAME AS HA for Livy ABOVE?
    - Support multi Spark versions
    - Implemented a metrics system for Livy
    - Support customize batch/interactive session lifecycle event
      handler, default log event with log4j, very helpful for
      troubleshooting
    - Optimize log to track which session id the log message came from,
      also very helpful for troubleshooting
    - Support customize Spark config optimization rules, can be used to
      optimize config for users’ job
    - A set of command line tools which can be used to replace Spark’s
      spark-submit, pyspark, spark-sql but actually submit the
      application in Livy
    - We are planning to implement a JDBC state store, and allow multi
      Livy Thrift sessions to share one backend Spark application in the
      next few months.
  - These items and others that are brought to community may need
    consolidation or multiple configurable options and will need to be
    part of the discussion
    - One-pager Livy Improvement Proposals (LIP) may make sense to drive
      these discussions and convergence
    - Feature Branch Strategy for large changes
      - Large features are hard to review; we will need to define a
        strategy
      - CTR for feature branches, and we will define a walkthrough of
        implementation details to aid in review for merge
  - Determine the Improvements to be delivered across first 3 Releases
    with Target Release Dates

- Ensure CVE and Dependency management hygiene is in place

The above approach will usher the community back to an active status with a
Roadmap of 3 or more release plans 

Re: Proposal to Revive Apache Livy Community

2022-10-16 Thread Jean-Baptiste Onofré
Hi Larry,

thanks for this !

I'm still interested to work on Livy. Can you please add me as mentor ?

Thanks !
Regards
JB

On Fri, Oct 14, 2022 at 8:38 PM larry mccay  wrote:
>
> Abstract
>
> Livy is a web service that exposes a REST interface for managing long
> running Apache Spark contexts in your cluster. With Livy, new applications
> can be built on top of Apache Spark that require fine grained interaction
> with many Spark contexts [1].
>
> While this project has been well regarded and used in many contexts as the
> defacto standard API to Spark environments, it has been incubating for over
> 5 years without graduation to a TLP and it has become difficult to
> impossible for fixes and improvements to be contributed as the current
> community seems to have moved on.
>
> There has been discussion regarding retirement of this podling where there
> seems to be some increasing interest in joining and reviving the community
> [2].
>
> The intent of this proposal is to avoid retiring a well regarded, actively
> used and rather mature project by reviving the PPMC and community with new
> folks that have a vested interest in the project and health of the
> community.
>
> Proposal
>
> We propose to revive the PPMC with a set of contributors and maintainers as
> mentors, PPMC members and committers.
>
> The retirement DISCUSS thread [2] has shown a growing interest in providing
> new committers and bringing improvements and fixes from organization’s
> internally maintained forks back to a revived community.
>
> General Approach to Revival:
>
> - Add new Mentors
>   - Larry McCay, lmc...@apache.org, Cloudera
>   - Sunil Govindan, sun...@apache.org, Cloudera
>   - Imran Rashid, iras...@apache.org, Cloudera
>
> - Add new Committers/PPMC
>   - Larry McCay, lmc...@apache.org, Cloudera
>   - Vinod Kumar Vavilapalli, vino...@cloudera.com, Cloudera
>   - Gyorgy Gal, ggal, gal.gyo...@gmail.com, Cloudera
>   - Wing Yew Poon, wyp...@cloudera.com, Cloudera
>   - Xilang Yan, xilang@gmail.com, Shopee
>   - Jianzhen Wu, myjianz...@gmail.com, Shopee
>   - Nagella Jagadeewara Rao, jnage...@visa.com, Visa
>   - Pralab Kumar, pralk...@visa.com, Visa
>   - Prasad Shrikant, shrikant@gmail.com, Visa
>   - Brahma Reddy Battula, bra...@apache.org, Visa
>
> - Invite existing PPMC members to opt-in or otherwise go emeritus
>   - Jean-Baptiste Onofré, jbono...@apache.org, Talend (opted-in via
>     Retirement DISCUSS thread [2])
>
> - Invite existing Committers/PPMC members to opt-in or otherwise go
>   emeritus
>
> - Establish Roadmap via follow up DISCUSS thread
>   - Known Improvements from Forks which will need proposals and
>     discussion:
>     - Adding HA for Livy
>     - Updating security capabilities (eg. kerberos for jdbc, fixing bugs
>       in encryption)
>     - Expanding the support for kubernetes
>     - Responding to CVEs in dependencies (eg. log4j, thrift)
>     - Livy rest cluster - IS THIS SAME AS HA for Livy ABOVE?
>     - Support multi Spark versions
>     - Implemented a metrics system for Livy
>     - Support customize batch/interactive session lifecycle event
>       handler, default log event with log4j, very helpful for
>       trouble shooting
>     - Optimize log to track which session id the log message came from,
>       also very helpful for trouble shooting
>     - Support customize Spark config optimization rules, can be used to
>       optimize config for users’ job
>     - A set of command line tool which can be used to replace Spark’s
>       spark-submit, pyspark, spark-sql but actually submit application
>       in Livy
>     - We are planning to implement a JDBC state store, and allow multi
>       Livy Thrift sessions to share one backend Spark application in the
>       next few months.
>   - These items and others that are brought to community may need
>     consolidation or multiple configurable options and will need to be
>     part of the discussion
>     - One-pager Livy Improvement Proposals (LIP) may make sense to drive
>       these discussions and convergence
>     - Feature Branch Strategy for large changes
>       - Large features are hard to review we will need to define a
>         strategy
>   - Determine the Improvements to be delivered across first 3 Releases
>     with Target Release Dates
>
> - Ensure CVE and Dependency management hygiene is in place
>
>
> The above 

Re: Proposal to Revive Apache Livy Community

2022-10-15 Thread larry mccay
Awesome - thanks, Justin!

On Sat, Oct 15, 2022, 11:51 AM Justin Mclean 
wrote:

> Hi,
>
> The project is free to choose RTC or CTR. I just wanted to check if it was
> considered. I’ve seen in some cases, CTR tends to put a lot of work onto
> existing committers and cause frustration when contributions are not
> reviewed in a timely way.
>
> Another thing to consider is what someone would have to do to become a
> committer. A RTC with a low committer bar would probably work better than a
> RTC with a high committer bar.
>
> You are not an IPMC member but can ask to become one as you are an ASF
> member.
>
> I started a roll call on the Livy private list that may get the attention
> of any existing PMC members.
>
> Kind Regards,
> Justin
>
>
> -
> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> For additional commands, e-mail: general-h...@incubator.apache.org
>
>


Re: Proposal to Revive Apache Livy Community

2022-10-15 Thread Justin Mclean
Hi,

The project is free to choose RTC or CTR. I just wanted to check if it was
considered. I’ve seen in some cases, CTR tends to put a lot of work onto 
existing committers and cause frustration when contributions are not reviewed 
in a timely way.

Another thing to consider is what someone would have to do to become a 
committer. A RTC with a low committer bar would probably work better than a RTC 
with a high committer bar.

You are not an IPMC member but can ask to become one as you are an ASF member.

I started a roll call on the Livy private list that may get the attention of 
any existing PMC members.

Kind Regards,
Justin



Re: Proposal to Revive Apache Livy Community

2022-10-15 Thread larry mccay
Hi Justin -

Thanks for the feedback.
I can buy all of that other than the dev practices suggestion.
The project originally used RTC as per the original proposal and being a
more mature podling, I don't think we need to move that fast.
With new committers, it will also be good to ensure others are familiar
with the code going in.

For pulling in large features from forks, we can have feature branches that
are CTR and define merge criteria.

I believe that I was an IPMC member at some point and have been a mentor on
other podlings (still am???)  but maybe somehow that went away?
We would certainly welcome any other mentors as well.
JB is currently listed as a Mentor on the site - I added him as a committer.
@JB do you intend to be a Committer or Mentor going forward?

The proposal was sent to the d...@livy.incubator.org list as well in order
to get their thoughts.
There have also been public threads about retiring it where this has been
discussed as you know.
Do you have further suggestions for how to reach out and/or how long to
wait?

Thanks again!

--larry

On Sat, Oct 15, 2022 at 12:24 AM Justin Mclean 
wrote:

> Hi,
>
> A couple of things stand out here to me:
> - Your mentors need to be IPMC members.
> - It would be helpful to include mentors who are not all affiliated with
> one company
> - It would be helpful to know what the existing PMC and the wider
> community think about this reboot. It may be there’s no one left, and no
> objections from those who are still about, but we should give them some
> time to speak up.
> - RTC can slow development and, in some cases, limit contributions,
> particularly when you have a less active project. Have you considered CTR?
> - While asking existing PPMC members to opt-in is probably fine, I think
> you should leave existing committers as they are and not remove anyone
> unless they asked to be removed.
> - Existing PPMC members who don’t explicitly opt-out should be able to
> rejoin the PPMC later if they ask.
>
> Kind Regards,
> Justin
>
>
> -
> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> For additional commands, e-mail: general-h...@incubator.apache.org
>
>


Re: Proposal to Revive Apache Livy Community

2022-10-14 Thread Justin Mclean
Hi,

A couple of things stand out here to me:
- Your mentors need to be IPMC members.
- It would be helpful to include mentors who are not all affiliated with one 
company
- It would be helpful to know what the existing PMC and the wider community 
think about this reboot. It may be there’s no one left, and no objections from 
those who are still about, but we should give them some time to speak up.
- RTC can slow development and, in some cases, limit contributions, 
particularly when you have a less active project. Have you considered CTR?
- While asking existing PPMC members to opt-in is probably fine, I think you 
should leave existing committers as they are and not remove anyone unless they 
asked to be removed.
- Existing PPMC members who don’t explicitly opt-out should be able to rejoin 
the PPMC later if they ask.

Kind Regards,
Justin



Re: Proposal to Revive Apache Livy Community

2022-10-14 Thread larry mccay
+d...@livy.incubator.apache.org 


On Fri, Oct 14, 2022 at 2:38 PM larry mccay  wrote:

> Abstract
>
> Livy is a web service that exposes a REST interface for managing long
> running Apache Spark contexts in your cluster. With Livy, new applications
> can be built on top of Apache Spark that require fine grained interaction
> with many Spark contexts [1].
>
> While this project has been well regarded and used in many contexts as the
> defacto standard API to Spark environments, it has been incubating for over
> 5 years without graduation to a TLP and it has become difficult to
> impossible for fixes and improvements to be contributed as the current
> community seems to have moved on.
>
> There has been discussion regarding retirement of this podling where there
> seems to be some increasing interest in joining and reviving the community
> [2].
>
> The intent of this proposal is to avoid retiring a well regarded, actively
> used and rather mature project by reviving the PPMC and community with new
> folks that have a vested interest in the project and health of the
> community.
>
> Proposal
>
> We propose to revive the PPMC with a set of contributors and maintainers
> as mentors, PPMC members and committers.
>
> The retirement DISCUSS thread [2] has shown a growing interest in
> providing new committers and bringing improvements and fixes from
> organization’s internally maintained forks back to a revived community.
>
> General Approach to Revival:
>
> - Add new Mentors
>   - Larry McCay, lmc...@apache.org, Cloudera
>   - Sunil Govindan, sun...@apache.org, Cloudera
>   - Imran Rashid, iras...@apache.org, Cloudera
>
> - Add new Committers/PPMC
>   - Larry McCay, lmc...@apache.org, Cloudera
>   - Vinod Kumar Vavilapalli, vino...@cloudera.com, Cloudera
>   - Gyorgy Gal, ggal, gal.gyo...@gmail.com, Cloudera
>   - Wing Yew Poon, wyp...@cloudera.com, Cloudera
>   - Xilang Yan, xilang@gmail.com, Shopee
>   - Jianzhen Wu, myjianz...@gmail.com, Shopee
>   - Nagella Jagadeewara Rao, jnage...@visa.com, Visa
>   - Pralab Kumar, pralk...@visa.com, Visa
>   - Prasad Shrikant, shrikant@gmail.com, Visa
>   - Brahma Reddy Battula, bra...@apache.org, Visa
>
> - Invite existing PPMC members to opt-in or otherwise go emeritus
>   - Jean-Baptiste Onofré, jbono...@apache.org, Talend (opted-in via
>     Retirement DISCUSS thread [2])
>
> - Invite existing Committers/PPMC members to opt-in or otherwise go
>   emeritus
>
> - Establish Roadmap via follow up DISCUSS thread
>   - Known Improvements from Forks which will need proposals and
>     discussion:
>     - Adding HA for Livy
>     - Updating security capabilities (eg. kerberos for jdbc, fixing bugs
>       in encryption)
>     - Expanding the support for kubernetes
>     - Responding to CVEs in dependencies (eg. log4j, thrift)
>     - Livy rest cluster - IS THIS SAME AS HA for Livy ABOVE?
>     - Support multi Spark versions
>     - Implemented a metrics system for Livy
>     - Support customize batch/interactive session lifecycle event
>       handler, default log event with log4j, very helpful for
>       trouble shooting
>     - Optimize log to track which session id the log message came from,
>       also very helpful for trouble shooting
>     - Support customize Spark config optimization rules, can be used to
>       optimize config for users’ job
>     - A set of command line tool which can be used to replace Spark’s
>       spark-submit, pyspark, spark-sql but actually submit application
>       in Livy
>     - We are planning to implement a JDBC state store, and allow multi
>       Livy Thrift sessions to share one backend Spark application in the
>       next few months.
>   - These items and others that are brought to community may need
>     consolidation or multiple configurable options and will need to be
>     part of the discussion
>     - One-pager Livy Improvement Proposals (LIP) may make sense to drive
>       these discussions and convergence
>     - Feature Branch Strategy for large changes
>       - Large features are hard to review we will need to define a
>         strategy
>   - Determine the Improvements to be delivered across first 3 Releases
>     with Target Release Dates
>
> - Ensure CVE and Dependency management hygiene is in place
>
>
> The above approach will usher the community back to an active status with
> a Roadmap of 3 or more 

Re: Proposal

2022-05-20 Thread Bertrand Delacretaz
Hi,

Le ven. 20 mai 2022 à 00:33, m sacks  a écrit :
>
> And if i want to publish it without building community just publish to
> github if i understand correctly?...

You can do that, and if you have plans to join the ASF later it's
useful to take a look at [1] and ideally shape your emerging community
along these lines.

And also look at [2] as mentioned earlier, which is more specifically
about the ASF Incubator.

-Bertrand

[1] https://community.apache.org/apache-way/apache-project-maturity-model.html
[2] https://incubator.apache.org/cookbook/



Re: Proposal

2022-05-19 Thread m sacks
Cool.

On Thu, May 19, 2022 at 3:37 PM Clebert Suconic 
wrote:

> yep... pretty much
>
> On Thu, May 19, 2022 at 6:33 PM m sacks  wrote:
> >
> > And if i want to publish it without building community just publish to
> github if i understand correctly?
> >
> >
> > On Thu, May 19, 2022 at 3:22 PM Clebert Suconic <
> clebert.suco...@gmail.com> wrote:
> >>
> >> If you widen your focus from the chatbot to a more open approach
> >> (e.g. Artificial Intelligence transformations, etc.) you would
> >> have a better chance of building community... in that case your
> >> Leonardo da Vinci could be part of the bigger scope project as an
> >> example of an implementation, while you develop a framework for such
> >> things?
> >>
> >>
> >> If you actually do anything related to AI, it would be a cool
> >> subject... I dunno if there are also other communities already doing
> >> something similar, in which case you could also aggregate to an already
> >> existing project.
> >>
> >> I'm in no way a warlord... just, as Daniel said, one member of the
> >> community, and I just wanted to bring you my 2 cents on a way to
> >> go with your project.
> >>
> >>
> >> On Tue, May 17, 2022 at 1:05 AM m sacks  wrote:
> >> >
> >> > I have some gpt3 based python code to simulate leonardo da vinci as a
> >> > chatbot proof of concept. I think it could be useful, but i am not
> sure, so
> >> > i leave it to the council of ASF warlords and generals to decide if
> the
> >> > code should be incubated?
> >> >
> >> >
> >> > I have not shared sources as of yet.
> >> >
> >> > On Mon, May 16, 2022 at 9:34 PM m sacks  wrote:
> >> >
> >> > > Test 1
> >> > >
> >> > > 2
> >> > >
> >> > > 3
> >> > >
> >> > > 4
> >> > >
> >> > > 5
> >> > >
> >> > > 6
> >> > >
> >> > > 7
> >> > >
> >> > > 8
> >> > >
> >> > > 9
> >> > >
> >> > > 0
> >> > >
> >>
> >>
> >>
> >> --
> >> Clebert Suconic
>
>
>
> --
> Clebert Suconic
>


Re: Proposal

2022-05-19 Thread Clebert Suconic
yep... pretty much

On Thu, May 19, 2022 at 6:33 PM m sacks  wrote:
>
> And if i want to publish it without building community just publish to github 
> if i understand correctly?
>
>
> On Thu, May 19, 2022 at 3:22 PM Clebert Suconic  
> wrote:
>>
>> If you widen your focus from the chatbot to a more open approach
>> (e.g. Artificial Intelligence transformations, etc.) you would
>> have a better chance of building community... in that case your
>> Leonardo da Vinci could be part of the bigger scope project as an
>> example of an implementation, while you develop a framework for such
>> things?
>>
>>
>> If you actually do anything related to AI, it would be a cool
>> subject... I dunno if there are also other communities already doing
>> something similar, in which case you could also aggregate to an already
>> existing project.
>>
>> I'm in no way a warlord... just, as Daniel said, one member of the
>> community, and I just wanted to bring you my 2 cents on a way to
>> go with your project.
>>
>>
>> On Tue, May 17, 2022 at 1:05 AM m sacks  wrote:
>> >
>> > I have some gpt3 based python code to simulate leonardo da vinci as a
>> > chatbot proof of concept. I think it could be useful, but i am not sure, so
>> > i leave it to the council of ASF warlords and generals to decide if the
>> > code should be incubated?
>> >
>> >
>> > I have not shared sources as of yet.
>> >
>> > On Mon, May 16, 2022 at 9:34 PM m sacks  wrote:
>> >
>> > > Test 1
>> > >
>> > > 2
>> > >
>> > > 3
>> > >
>> > > 4
>> > >
>> > > 5
>> > >
>> > > 6
>> > >
>> > > 7
>> > >
>> > > 8
>> > >
>> > > 9
>> > >
>> > > 0
>> > >
>>
>>
>>
>> --
>> Clebert Suconic



-- 
Clebert Suconic



Re: Proposal

2022-05-19 Thread m sacks
And if i want to publish it without building community just publish to
github if i understand correctly?


On Thu, May 19, 2022 at 3:22 PM Clebert Suconic 
wrote:

> If you widen your focus from the chatbot to a more open approach
> (e.g. Artificial Intelligence transformations, etc.) you would
> have a better chance of building community... in that case your
> Leonardo da Vinci could be part of the bigger scope project as an
> example of an implementation, while you develop a framework for such
> things?
>
>
> If you actually do anything related to AI, it would be a cool
> subject... I dunno if there are also other communities already doing
> something similar, in which case you could also aggregate to an already
> existing project.
>
> I'm in no way a warlord... just, as Daniel said, one member of the
> community, and I just wanted to bring you my 2 cents on a way to
> go with your project.
>
>
> On Tue, May 17, 2022 at 1:05 AM m sacks  wrote:
> >
> > I have some gpt3 based python code to simulate leonardo da vinci as a
> > chatbot proof of concept. I think it could be useful, but i am not sure,
> so
> > i leave it to the council of ASF warlords and generals to decide if the
> > code should be incubated?
> >
> >
> > I have not shared sources as of yet.
> >
> > On Mon, May 16, 2022 at 9:34 PM m sacks  wrote:
> >
> > > Test 1
> > >
> > > 2
> > >
> > > 3
> > >
> > > 4
> > >
> > > 5
> > >
> > > 6
> > >
> > > 7
> > >
> > > 8
> > >
> > > 9
> > >
> > > 0
> > >
>
>
>
> --
> Clebert Suconic
>


Re: Proposal

2022-05-19 Thread Clebert Suconic
If you widen your focus from the chatbot to a more open approach
(e.g. Artificial Intelligence transformations, etc.) you would
have a better chance of building community... in that case your
Leonardo da Vinci could be part of the bigger scope project as an
example of an implementation, while you develop a framework for such
things?


If you actually do anything related to AI, it would be a cool
subject... I dunno if there are also other communities already doing
something similar, in which case you could also aggregate to an already
existing project.

I'm in no way a warlord... just, as Daniel said, one member of the
community, and I just wanted to bring you my 2 cents on a way to
go with your project.


On Tue, May 17, 2022 at 1:05 AM m sacks  wrote:
>
> I have some gpt3 based python code to simulate leonardo da vinci as a
> chatbot proof of concept. I think it could be useful, but i am not sure, so
> i leave it to the council of ASF warlords and generals to decide if the
> code should be incubated?
>
>
> I have not shared sources as of yet.
>
> On Mon, May 16, 2022 at 9:34 PM m sacks  wrote:
>
> > Test 1
> >
> > 2
> >
> > 3
> >
> > 4
> >
> > 5
> >
> > 6
> >
> > 7
> >
> > 8
> >
> > 9
> >
> > 0
> >



-- 
Clebert Suconic

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



Re: Proposal

2022-05-19 Thread m sacks
FTR: open source is partially composed of many “private” teams that
still adhere to ASF open source standards.


On Wed, May 18, 2022 at 6:59 PM Daniel Widdis  wrote:

> Speaking only for myself, I’ll say I am interested in the open source
> software model and am not interested in participating in any private group.
>
>
>
> Additionally, GPT3 is a closed source model with the only “free” part
> being a public-facing API, with usage primarily benefiting one company.
>
>
>
> I would expect you might get more interest in building a product around
> GPT-J [1], which is an Apache License 2.0 product.  But I also expect most
> people on this list would like to see what is proposed and discussed in the
> open.
>
>
>
> [1] https://github.com/kingoflolz/mesh-transformer-jax/
>
>
>
>
>
>
>
> *From: *m sacks 
> *Date: *Wednesday, May 18, 2022 at 5:58 PM
> *To: *Daniel Widdis , 
> *Subject: *Re: Proposal
>
>
>
> So I’m not sure if this made it either: is there at least one person
> interested in collaborating on this project? It involves GPT-3.
>
>
>
> On Mon, May 16, 2022 at 10:47 PM m sacks  wrote:
>
> Not sure if this made it:
>
> Just a term of endearment, not meant to be taken literally.
>
>
>
> Sure.
>
>
>
> Initially the community would be a private group put together by me.
>
>
>
> Then we can discuss building it once others have decided if it’s even a
> useful application first?
>
>
>
> On Mon, May 16, 2022 at 10:20 PM Daniel Widdis  wrote:
>
> I'm not an ASF warlord or general.  In fact, I don't think such things
> exist. It's about community.  Decisions are made by communities.  Warlords,
> generals, and benevolent dictators don't fit well.
>
> Related, I don't see anything "community" in your post. You state "I" have
> got code, not "we".
>
> You can have the best code in the universe, but if you don't have a
> community developing it, it's not really a good fit here.
>
> So tell us less about your code and more about your community developing
> it.
>
>
> On 5/16/22, 10:05 PM, "m sacks"  wrote:
>
> I have some gpt3 based python code to simulate leonardo da vinci as a
> chatbot proof of concept. I think it could be useful, but i am not
> sure, so
> i leave it to the council of ASF warlords and generals to decide if the
> code should be incubated?
>
>
> I have not shared sources as of yet.
>
>
>


Re: Proposal

2022-05-18 Thread Daniel Widdis
Speaking only for myself, I’ll say I am interested in the open source software 
model and am not interested in participating in any private group.

 

Additionally, GPT3 is a closed source model with the only “free” part being a 
public-facing API, with usage primarily benefiting one company.

 

I would expect you might get more interest in building a product around GPT-J 
[1], which is an Apache License 2.0 product.  But I also expect most people on 
this list would like to see what is proposed and discussed in the open.  

 

[1] https://github.com/kingoflolz/mesh-transformer-jax/

 

 

 

From: m sacks 
Date: Wednesday, May 18, 2022 at 5:58 PM
To: Daniel Widdis , 
Subject: Re: Proposal

 

So I’m not sure if this made it either: is there at least one person interested 
in collaborating on this project? It involves GPT-3.

 

On Mon, May 16, 2022 at 10:47 PM m sacks  wrote:

Not sure if this made it:

Just a term of endearment, not meant to be taken literally.

 

Sure. 

 

Initially the community would be a private group put together by me. 

 

Then we can discuss building it once others have decided if it’s even a useful 
application first?

 

On Mon, May 16, 2022 at 10:20 PM Daniel Widdis  wrote:

I'm not an ASF warlord or general.  In fact, I don't think such things exist. 
It's about community.  Decisions are made by communities.  Warlords, generals, 
and benevolent dictators don't fit well.

Related, I don't see anything "community" in your post. You state "I" have got 
code, not "we".

You can have the best code in the universe, but if you don't have a community 
developing it, it's not really a good fit here.

So tell us less about your code and more about your community developing it.


On 5/16/22, 10:05 PM, "m sacks"  wrote:

I have some gpt3 based python code to simulate leonardo da vinci as a
chatbot proof of concept. I think it could be useful, but i am not sure, so
i leave it to the council of ASF warlords and generals to decide if the
code should be incubated?


I have not shared sources as of yet.





Re: Proposal

2022-05-18 Thread m sacks
So I’m not sure if this made it either: is there at least one person
interested in collaborating on this project? It involves GPT-3.

On Mon, May 16, 2022 at 10:47 PM m sacks  wrote:

> Not sure if this made it:
> Just a term of endearment, not meant to be taken literally.
>
> Sure.
>
> Initially the community would be a private group put together by me.
>
> Then we can discuss building it once others have decided if it’s even a
> useful application first?
>
> On Mon, May 16, 2022 at 10:20 PM Daniel Widdis  wrote:
>
>> I'm not an ASF warlord or general.  In fact, I don't think such things
>> exist. It's about community.  Decisions are made by communities.  Warlords,
>> generals, and benevolent dictators don't fit well.
>>
>> Related, I don't see anything "community" in your post. You state "I"
>> have got code, not "we".
>>
>> You can have the best code in the universe, but if you don't have a
>> community developing it, it's not really a good fit here.
>>
>> So tell us less about your code and more about your community developing
>> it.
>>
>>
>> On 5/16/22, 10:05 PM, "m sacks"  wrote:
>>
>> I have some gpt3 based python code to simulate leonardo da vinci as a
>> chatbot proof of concept. I think it could be useful, but i am not
>> sure, so
>> i leave it to the council of ASF warlords and generals to decide if
>> the
>> code should be incubated?
>>
>>
>> I have not shared sources as of yet.
>>
>>
>>
>>


Re: Proposal

2022-05-17 Thread Michael Wechner
I agree, but I think we need to give some more concrete guidance on how to 
develop an open community and develop in the open, because although it 
is clear to ASF people, I think it is often not so clear to many other 
people.


@msacks: Did you already have a look at 
https://incubator.apache.org/cookbook/ ?


HTH

Michael

On 17.05.22 at 07:56, Dave Fisher wrote:

Develop an open community and develop in the open. If the community starts to 
grow then come back and ask for guidance in how to make your open source 
community an ASF community.

Did I mention open and community enough?

All the best,
Dave

Sent from my iPhone


On May 16, 2022, at 10:47 PM, m sacks  wrote:

Not sure if this made it:
Just a term of endearment, not meant to be taken literally.

Sure.

Initially the community would be a private group put together by me.

Then we can discuss building it once others have decided if it’s even a
useful application first?


On Mon, May 16, 2022 at 10:20 PM Daniel Widdis  wrote:

I'm not an ASF warlord or general.  In fact, I don't think such things
exist. It's about community.  Decisions are made by communities.  Warlords,
generals, and benevolent dictators don't fit well.

Related, I don't see anything "community" in your post. You state "I" have
got code, not "we".

You can have the best code in the universe, but if you don't have a
community developing it, it's not really a good fit here.

So tell us less about your code and more about your community developing
it.


On 5/16/22, 10:05 PM, "m sacks"  wrote:

I have some gpt3 based python code to simulate leonardo da vinci as a
chatbot proof of concept. I think it could be useful, but i am not
sure, so
i leave it to the council of ASF warlords and generals to decide if the
code should be incubated?


I have not shared sources as of yet.









Re: Proposal

2022-05-16 Thread Dave Fisher
Develop an open community and develop in the open. If the community starts to 
grow then come back and ask for guidance in how to make your open source 
community an ASF community.

Did I mention open and community enough?

All the best,
Dave

Sent from my iPhone

> On May 16, 2022, at 10:47 PM, m sacks  wrote:
> 
> Not sure if this made it:
> Just a term of endearment, not meant to be taken literally.
> 
> Sure.
> 
> Initially the community would be a private group put together by me.
> 
> Then we can discuss building it once others have decided if it’s even a
> useful application first?
> 
>> On Mon, May 16, 2022 at 10:20 PM Daniel Widdis  wrote:
>> 
>> I'm not an ASF warlord or general.  In fact, I don't think such things
>> exist. It's about community.  Decisions are made by communities.  Warlords,
>> generals, and benevolent dictators don't fit well.
>> 
>> Related, I don't see anything "community" in your post. You state "I" have
>> got code, not "we".
>> 
>> You can have the best code in the universe, but if you don't have a
>> community developing it, it's not really a good fit here.
>> 
>> So tell us less about your code and more about your community developing
>> it.
>> 
>> 
>> On 5/16/22, 10:05 PM, "m sacks"  wrote:
>> 
>>I have some gpt3 based python code to simulate leonardo da vinci as a
>>chatbot proof of concept. I think it could be useful, but i am not
>> sure, so
>>i leave it to the council of ASF warlords and generals to decide if the
>>code should be incubated?
>> 
>> 
>>I have not shared sources as of yet.
>> 
>> 
>> 
>> 





Re: Proposal

2022-05-16 Thread m sacks
Not sure if this made it:
Just a term of endearment, not meant to be taken literally.

Sure.

Initially the community would be a private group put together by me.

Then we can discuss building it once others have decided if it’s even a
useful application first?

On Mon, May 16, 2022 at 10:20 PM Daniel Widdis  wrote:

> I'm not an ASF warlord or general.  In fact, I don't think such things
> exist. It's about community.  Decisions are made by communities.  Warlords,
> generals, and benevolent dictators don't fit well.
>
> Related, I don't see anything "community" in your post. You state "I" have
> got code, not "we".
>
> You can have the best code in the universe, but if you don't have a
> community developing it, it's not really a good fit here.
>
> So tell us less about your code and more about your community developing
> it.
>
>
> On 5/16/22, 10:05 PM, "m sacks"  wrote:
>
> I have some gpt3 based python code to simulate leonardo da vinci as a
> chatbot proof of concept. I think it could be useful, but i am not
> sure, so
> i leave it to the council of ASF warlords and generals to decide if the
> code should be incubated?
>
>
> I have not shared sources as of yet.
>
>
>
>


Re: Proposal

2022-05-16 Thread Daniel Widdis
I'm not an ASF warlord or general.  In fact, I don't think such things exist. 
It's about community.  Decisions are made by communities.  Warlords, generals, 
and benevolent dictators don't fit well.

Related, I don't see anything "community" in your post. You state "I" have got 
code, not "we".

You can have the best code in the universe, but if you don't have a community 
developing it, it's not really a good fit here.

So tell us less about your code and more about your community developing it.


On 5/16/22, 10:05 PM, "m sacks"  wrote:

I have some gpt3 based python code to simulate leonardo da vinci as a
chatbot proof of concept. I think it could be useful, but i am not sure, so
i leave it to the council of ASF warlords and generals to decide if the
code should be incubated?


I have not shared sources as of yet.







Re: [Proposal] lxdb - proposal for Apache Incubation

2021-03-05 Thread lidong dai
Hi,
  Kammi’s summary is very comprehensive. Try to open source the project first,
and you'd better find an experienced mentor to help you; it will be very
helpful! Good luck.


Best Regards
---
DolphinScheduler(Incubator) PPMC
Lidong Dai
dailidon...@gmail.com
---


On Sun, Feb 28, 2021 at 6:52 PM Furkan KAMACI 
wrote:

> Hi,
>
> Actually you have a detailed documentation which explains which approach
> you have compared to similar systems and performance metrics of following
> them i.e. reducing storage 10 to the 100 times or having low latency
> queries.
>
> My advices are (some of them are same with Sheng's and Liang's ):
>
> 1) Find an experienced mentor to guide you.
>
> 2) Start to translate your documentation to English.
>
> 3) Open source your project. How can we have a comment on your project if
> we cannot see anything about it?
>
> 4) Gain contributors to your project. At least you should show your
> intention to have committers/contributors out of your company. Eliminate
> the risk of being non-meritocratic management of the project.
>
> 5) Structure your proposal. Explain why people need this project, which
> problems do current projects have and how you managed to handle them. We
> should understand is it a bundle of other projects, a completely new
> project, or a wrapper of other projects which eliminates the shortcomings
> of them.
>
> 6) Find a suitable name for your project in order to not try to solve
> trademark problems that may lose your time if you enter the incubation.
>
> Kind Regards,
> Furkan KAMACI
>
>
> On Sun, Feb 28, 2021 at 1:02 PM Liang Chen 
> wrote:
>
> > Hi
> >
> > It would be better if you could find an experienced IPMC member to help
> you
> > for preparing the proposal.
> > Based on Sheng Wu input, i have one more comment : can you please explain
> > what are the different with other similar data analysis DB?  you can
> > consider explaining from use cases perspective.
> >
> > Regards
> > Liang
> >
> >
> > fp wrote
> > > Dear Apache Incubator Community,
> > >
> > >
> > > Please accept the following proposal for presentation and discussion:
> > > https://github.com/lucene-cn/lxdb/wiki
> > >
> > >
> > > LXDB is a high-performance,OLAP,full text search database.it`s base on
> > > hbase,but replaced hfile with lucene index to support more effective
> > > secondary indexes,it`s also base on spark sql,so that you can used sql
> > api
> > > to visit data and do olap calculate. and also the lucene index is store
> > on
> > > hdfs (not local disk).
> > >
> > >
> > > In our Production System, LXDB supported 200+ clusters,some of the
> single
> > > cluster is 1000+ nodes,insert 200 billion rows per day ( 2
> > > billion rows for total), one of the biggest single table has 200million
> > > lucene index on LXDB.
> > >
> > >
> > > Hadoop`s father Doug Cutting cut nutch into HBase, MapReduce (hive),
> > HDFS,
> > > Lucene.We have merged these separated projects again,LXDBequals
> > > spark sql+hbase+lucene+parquet+hdfs,it is a super database.It took me
> 10
> > > years to complete these merging operations.But the purpose is no
> longer a
> > > search engine, but a database.
> > >
> > >
> > >
> > >
> > >
> > > Best regards
> > >  yannian mu
> > >
> > >
> > >
> > >
> > > LXDB Proposal
> > > == Abstract ==
> > > LXDB is a high-performance,OLAP,full text search database.
> > >
> > >
> > > === it`s base on hbase,but replaced hfile with lucene index to support
> > > more effective secondary indexes.===
> > > we modify hbase region server ,we change hfile to lucene,when put
> > > data we put document to lucene instande of put data to
> hfile
> > > lucene index store on region server(it is not sote in
> > > different cluster like elstice search+hbase ,it takes to copy of data)
> > >
> > >
> > > === it`s base on spark sql for olap===
> > > we Integrated spark and hbase together ,it`s useage like this ,
> > > 1.unpackage lxdb.tar.gz
> > > 2.config hadoop_config path,
> > > 3.run start-all.sh to start cluster.
> > > lxdb can startup spark through hadoop yarn ,and then spark executor
> > > process Embedded start hbase region server service .
> > >
> > >
> > > you can operate lxdb database throuth spark sql api(hive) or mysql api.
> > > 1.the sql used spark rdd+hbase scaner to visit hbase .
> > > 2.the sql`s condition (filter or group by agg) will predicate to hbase
> ,
> > > 3.hbase used lucene index to filter data in region server.
> > > all of the spark,hbase,lucene is Embedded Integrated together,it is
> > > not a seperate cluster ,that is the different with solr/es
> +
> > > hbase+spark Solution.
> > >
> > >
> > > == Background ==
> > > === Multiple copies of data ===
> > > Apache HBase+Elastic Search is the most popular Solution on full text
> > > search ,but it`s weak on Online AnalyticalProcessing.
> > > so most of the time the Production System used spark(or hive or impala
> or
> > > presto) ,hbase,solr/es at the same time.Multiple copies of data are
> > stored
> > > 

Re: [Proposal] lxdb - proposal for Apache Incubation

2021-02-28 Thread Juan Pan
Hi,


My +1 for the suggestions and summary from Furkan KAMACI.
I guess they reflect many of the IPMC's concerns.
Some of the items will take you plenty of time to handle,
so I am unsure whether now is the best time for you to propose.
But at least I suppose you now have a direction for improvement.


Sincerely,
Trista



---
Email:panj...@apache.org
Juan Pan(Trista) Apache ShardingSphere


On 02/28/2021 18:51,Furkan KAMACI wrote:
Hi,

Actually you have a detailed documentation which explains which approach
you have compared to similar systems and performance metrics of following
them i.e. reducing storage 10 to the 100 times or having low latency
queries.

My advices are (some of them are same with Sheng's and Liang's ):

1) Find an experienced mentor to guide you.

2) Start to translate your documentation to English.

3) Open source your project. How can we have a comment on your project if
we cannot see anything about it?

4) Gain contributors to your project. At least you should show your
intention to have committers/contributors out of your company. Eliminate
the risk of being non-meritocratic management of the project.

5) Structure your proposal. Explain why people need this project, which
problems do current projects have and how you managed to handle them. We
should understand is it a bundle of other projects, a completely new
project, or a wrapper of other projects which eliminates the shortcomings
of them.

6) Find a suitable name for your project in order to not try to solve
trademark problems that may lose your time if you enter the incubation.

Kind Regards,
Furkan KAMACI


On Sun, Feb 28, 2021 at 1:02 PM Liang Chen  wrote:

Hi

It would be better if you could find an experienced IPMC member to help you
for preparing the proposal.
Based on Sheng Wu input, i have one more comment : can you please explain
what are the different with other similar data analysis DB?  you can
consider explaining from use cases perspective.

Regards
Liang


fp wrote
Dear Apache Incubator Community,


Please accept the following proposal for presentation and discussion:
https://github.com/lucene-cn/lxdb/wiki


LXDB is a high-performance OLAP and full-text search database. It is based on
HBase, but replaces HFile with a Lucene index to support more effective
secondary indexes. It is also based on Spark SQL, so you can use the SQL API
to access data and run OLAP calculations. The Lucene index is stored on
HDFS (not on local disk).


In our production system, LXDB supports 200+ clusters; some single
clusters have 1000+ nodes and insert 200 billion rows per day ( 2
billion rows for total). One of the biggest single tables has 200 million
Lucene index on LXDB.


Hadoop's father, Doug Cutting, split Nutch into HBase, MapReduce (Hive),
HDFS, and Lucene. We have merged these separate projects again: LXDB equals
Spark SQL + HBase + Lucene + Parquet + HDFS; it is a super database. It took me 10
years to complete these merges. The purpose, however, is no longer a
search engine but a database.





Best regards
 yannian mu




LXDB Proposal
== Abstract ==
LXDB is a high-performance,OLAP,full text search database.


=== it`s base on hbase,but replaced hfile with lucene index to support
more effective secondary indexes.===
we modify hbase region server ,we change hfile to lucene,when put
data we put document to lucene instande of put data to hfile
lucene index store on region server(it is not sote in
different cluster like elstice search+hbase ,it takes to copy of data)


=== it`s base on spark sql for olap===
we Integrated spark and hbase together ,it`s useage like this ,
1.unpackage lxdb.tar.gz
2.config hadoop_config path,
3.run start-all.sh to start cluster.
lxdb can startup spark through hadoop yarn ,and then spark executor
process Embedded start hbase region server service .


you can operate lxdb database throuth spark sql api(hive) or mysql api.
1.the sql used spark rdd+hbase scaner to visit hbase .
2.the sql`s condition (filter or group by agg) will predicate to hbase ,
3.hbase used lucene index to filter data in region server.
all of the spark,hbase,lucene is Embedded Integrated together,it is
not a seperate cluster ,that is the different with solr/es +
hbase+spark Solution.
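
The three query steps above can be illustrated with a small Python sketch. This
is only a toy model under stated assumptions, not LXDB code: the names Region,
put, scan, and count_by are hypothetical, a dictionary of sets stands in for the
Lucene index, and only equality filters are supported.

```python
from collections import defaultdict


class Region:
    """Toy stand-in for one region server holding rows plus an inverted index."""

    def __init__(self):
        self.rows = {}                  # row key -> dict of column values
        self.index = defaultdict(set)   # (column, value) -> set of row keys

    def put(self, key, row):
        # A write stores the row and indexes it in the same place, so no
        # second copy of the data lives in a separate search cluster.
        # (Toy limitation: overwriting a key does not clean old index entries.)
        self.rows[key] = row
        for column, value in row.items():
            self.index[(column, value)].add(key)

    def scan(self, filters):
        # Pushed-down equality filters are answered by intersecting posting
        # sets from the index instead of reading every row.
        keys = None
        for column, value in filters.items():
            postings = self.index.get((column, value), set())
            keys = postings if keys is None else keys & postings
        if keys is None:                # no filter given: full scan
            keys = set(self.rows)
        return [self.rows[k] for k in sorted(keys)]


def count_by(regions, filters, group_column):
    # The "SQL layer": push the filters to every region, then aggregate
    # only the rows that survived the index lookup.
    counts = defaultdict(int)
    for region in regions:
        for row in region.scan(filters):
            counts[row[group_column]] += 1
    return dict(counts)
```

With two regions loaded, count_by([r1, r2], {"city": "rome"}, "kind") would
behave roughly like SELECT kind, COUNT(*) ... WHERE city = 'rome' GROUP BY kind,
with the WHERE clause evaluated inside each region.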


== Background ==
=== Multiple copies of data ===
Apache HBase + Elasticsearch is the most popular solution for full-text
search, but it is weak at online analytical processing (OLAP).
So most of the time a production system uses Spark (or Hive, Impala, or
Presto), HBase, and Solr/ES at the same time. Multiple copies of the data are
stored in multiple systems, the systems have different APIs, and data consistency
is difficult to guarantee. For these reasons we merged
Spark, HBase, and Elastic into one project. Its goal is to use one copy of the
data, one cluster, and one API to cover OLAP, KV, full-text, and other database scenarios.


=== Merging and splitting of Lucene indexes (HStore) across different
machines on HDFS ===
As we all know, Solr/ES store files in local 

Re: Re: [Proposal] lxdb - proposal for Apache Incubation

2021-02-28 Thread f...@lucene.cn
This is the address of my improvement project: https://github.com/lucene-cn/lxhadoop
One of the ideas that came to me later is to replace the Parquet format 
with Lucene's inverted and forward indexes, so that I can carry out 
multi-condition full-text retrieval. The multi-column layout of Parquet lets me 
avoid the performance problem of random reads by efficiently traversing the 
inverted table.

Differences from Alibaba AnalyticDB
I'm not particularly familiar with AnalyticDB, so I just looked up some 
information through a search engine. If I have misunderstood anything, please 
correct me.
Most of the time the two are really similar. AnalyticDB is an excellent 
database, but its technical principles can hardly be found on the Internet. 
From my personal point of view, they may differ in the following ways:
#1) AnalyticDB is a cloud-native data warehouse in the full sense. This is also 
a feature added in its new edition, which supports the separation of 
storage and computing and time-shared, on-demand elasticity of resources. 
The same piece of data can start different computing resources on different 
computing nodes according to different computations.
LXDB, however, is not a truly cloud-native database. Although we store the Lucene 
index on HDFS, we can only separate storage from computing. At present, 
when Lucene is opened for the first time, index information such 
as tip must be preloaded into memory, so Lucene stays open 
in a resident process. Therefore LXDB has not been able to separate 
computation from computation, that is, it cannot distribute computing 
resources to different processes according to different queries. This has 
always been a regret with LXDB, so I have been working on it in recent years.
Cloud-native databases currently have great market potential, and we are 
willing to try. I know that changing Lucene in this way is not difficult, 
or at least less difficult than integrating Spark, HBase, and Lucene together.
#2) AnalyticDB cannot be self-hosted; it can only run on the cloud platform 
provided by its vendor and must be purchased with the underlying cloud environment, which 
sometimes places more restrictions on users. LXDB is based on the Hadoop platform. As 
long as users have a Hadoop environment, LXDB can start its services directly through 
YARN, which suits both private deployment and deployment on the cloud, and 
it does not tie users to any vendor. It is relatively open.
#3) AnalyticDB feels more like a batch engine, suited to 
centralized import and batch query; at least its cloud-native mode appears to 
work like this, or I did not find the user manual for real-time import.
LXDB, by contrast, is a real-time engine with low data latency. Relatively speaking, 
it is easier for a batch engine to become cloud native, while it is more 
difficult for a real-time engine with millisecond latency to separate 
storage from computing. It needs a snapshot mechanism to record the data changes 
at a given time, so that computation can be separated 
between different nodes.
#4) According to the official documents (see specifications and restrictions), 
the best configuration is C32. The number of nodes supported by C32 is fewer 
than 128, and the storage capacity is 1 PB. In the production environment, LXDB 
has 904 nodes, 50 PB of disk capacity, and 70% storage utilization. Of course, this 
comparison may be inaccurate and unfair to ADB.



f...@lucene.cn  yannian mu



f...@lucene.cn  yannian mu
 
From: Ming Wen
Date: 2021-02-28 21:18
To: general
Subject: Re: Re: [Proposal] lxdb - proposal for Apache Incubation
Hi, fp,
Your email is hard to read.
Please change to a normal mail client first.
Back to your proposal: the key concern is not technology, but that the IPMC cannot
evaluate a project when we cannot see anything.
 
Thanks,
Ming Wen, Apache APISIX PMC Chair
Twitter: _WenMing
 
 
f...@lucene.cn wrote on Sun, Feb 28, 2021 at 9:02 PM:
 
> Hi Furkan Kamaci
>
> Thank you for your proposal, I will start to improve and prepare
>
> 1.Find an experienced mentor to guide you.
>
>  todo
>
> 2.Start to translate your documentation to English.
>
> 3.Open source your project. How can we have a comment on your project if
> we cannot see anything about it?
>
>  give me some time,I discussed with my team, my English is too poor.
>
> 4) Gain contributors to your project. At least you should show your
> intention to have committers/contributors out of your company. Eliminate
> the risk of being non-meritocratic management of the project.
>
> That's what I have to do
>
> 5) Structure your proposal. Explain why people need this project, which

Re: Re: [Proposal] lxdb - proposal for Apache Incubation

2021-02-28 Thread Ming Wen
Hi, fp,
Your email is hard to read.
Please change to a normal mail client first.
Back to your proposal: the key concern is not technology, but that the IPMC cannot
evaluate a project when we cannot see anything.

Thanks,
Ming Wen, Apache APISIX PMC Chair
Twitter: _WenMing


f...@lucene.cn wrote on Sun, Feb 28, 2021 at 9:02 PM:

> Hi Furkan Kamaci
>
> Thank you for your proposal, I will start to improve and prepare
>
> 1.Find an experienced mentor to guide you.
>
>  todo
>
> 2.Start to translate your documentation to English.
>
> 3.Open source your project. How can we have a comment on your project if
> we cannot see anything about it?
>
>  give me some time,I discussed with my team, my English is too poor.
>
> 4) Gain contributors to your project. At least you should show your
> intention to have committers/contributors out of your company. Eliminate
> the risk of being non-meritocratic management of the project.
>
> That's what I have to do
>
> 5) Structure your proposal. Explain why people need this project, which
> problems do current projects have and how you managed to handle them. We
> should understand is it a bundle of other projects, a completely new
> project, or a wrapper of other projects which eliminates the shortcomings
> of them.
>
> 6) Find a suitable name for your project in order to not try to solve
> trademark problems that may lose your time if you enter the incubation.
>
> ok i thike a new name ,for example like hydrogen sql
>
> f...@lucene.cn  yannian mu
>
> From: Furkan KAMACI
> Date: 2021-02-28 18:51
> To: general
> Subject: Re: [Proposal] lxdb - proposal for Apache Incubation
>
> Hi,
>
> Actually you have a detailed documentation which explains which approach
> you have compared to similar systems and performance metrics of following
> them i.e. reducing storage 10 to the 100 times or having low latency
> queries.
>
> My advices are (some of them are same with Sheng's and Liang's ):
>
> 1) Find an experienced mentor to guide you.
>
> 2) Start to translate your documentation to English.
>
> 3) Open source your project. How can we have a comment on your project if
> we cannot see anything about it?
>
> 4) Gain contributors to your project. At least you should show your
> intention to have committers/contributors out of your company. Eliminate
> the risk of being non-meritocratic management of the project.
>
> 5) Structure your proposal. Explain why people need this project, which
> problems do current projects have and how you managed to handle them. We
> should understand is it a bundle of other projects, a completely new
> project, or a wrapper of other projects which eliminates the shortcomings
> of them.
>
> 6) Find a suitable name for your project in order to not try to solve
> trademark problems that may lose your time if you enter the incubation.
>
> Kind Regards,
> Furkan KAMACI
>
> On Sun, Feb 28, 2021 at 1:02 PM Liang Chen wrote:
>
> > Hi
> >
> > It would be better if you could find an experienced IPMC member to help you
> > for preparing the proposal.
> > Based on Sheng Wu input, i have one more comment : can you please explain
> > what are the different with other similar data analysis DB?  you can
> > consider explaining from use cases perspective.
> >
> > Regards
> > Liang
> >
> >
> > fp wrote
> > > Dear Apache Incubator Community,
> > >
> > >
> > > Please accept the following proposal for presentation and discus

Re: Re: [Proposal] lxdb - proposal for Apache Incubation

2021-02-28 Thread f...@lucene.cn
Hi Furkan Kamaci,

Thank you for your advice. I will start improving and preparing the proposal.

> 1) Find an experienced mentor to guide you.

To do.

> 2) Start to translate your documentation to English.
> 3) Open source your project. How can we have a comment on your project if
> we cannot see anything about it?

Please give me some time. I have discussed this with my team; my English is
very poor.

> 4) Gain contributors to your project. At least you should show your
> intention to have committers/contributors out of your company. Eliminate
> the risk of being non-meritocratic management of the project.

That is what I have to do.

> 5) Structure your proposal. Explain why people need this project, which
> problems do current projects have and how you managed to handle them. We
> should understand is it a bundle of other projects, a completely new
> project, or a wrapper of other projects which eliminates the shortcomings
> of them.
> 6) Find a suitable name for your project in order to not try to solve
> trademark problems that may lose your time if you enter the incubation.

OK, I will think of a new name, for example something like "hydrogen sql".


f...@lucene.cn  yannian mu

From: Furkan KAMACI
Date: 2021-02-28 18:51
To: general
Subject: Re: [Proposal] lxdb - proposal for Apache Incubation

Hi,

Actually you have a detailed documentation which explains which approach
you have compared to similar systems and performance metrics of following
them i.e. reducing storage 10 to the 100 times or having low latency
queries.

My advices are (some of them are same with Sheng's and Liang's ):

1) Find an experienced mentor to guide you.

2) Start to translate your documentation to English.

3) Open source your project. How can we have a comment on your project if
we cannot see anything about it?

4) Gain contributors to your project. At least you should show your
intention to have committers/contributors out of your company. Eliminate
the risk of being non-meritocratic management of the project.

5) Structure your proposal. Explain why people need this project, which
problems do current projects have and how you managed to handle them. We
should understand is it a bundle of other projects, a completely new
project, or a wrapper of other projects which eliminates the shortcomings
of them.

6) Find a suitable name for your project in order to not try to solve
trademark problems that may lose your time if you enter the incubation.

Kind Regards,
Furkan KAMACI


On Sun, Feb 28, 2021 at 1:02 PM Liang Chen  wrote:

> Hi
>
> It would be better if you could find an experienced IPMC member to help you
> for preparing the proposal.
> Based on Sheng Wu input, i have one more comment : can you please explain
> what are the different with other similar data analysis DB?  you can
> consider explaining from use cases perspective.
>
> Regards
> Liang
>
>
> fp wrote
> > Dear Apache Incubator Community,
> >
> >
> > Please accept the following proposal for presentation and discussion:
> > https://github.com/lucene-cn/lxdb/wiki
> >
> >
> > LXDB is a high-performance,OLAP,full text search database.it`s base on
> > hbase,but replaced hfile with lucene index to support more effective
> > secondary indexes,it`s also base on spark sql,so that you can used sql api
> > to visit data and do olap calculate. and also the lucene index is store on
> > hdfs (not local disk).
> >
> >
> > In our Production System, LXDB supported 200+ clusters,some of the single
> > cluster is 1000+ nodes,insert 200 billion rows per day ( 2
> > billion rows for total), one of the biggest single table has 200million
> > lucene index on LXDB.
> >
> >
> > Hadoop`s father Doug Cutting cut nutch into HBase, MapReduce (hive), HDFS,
> > Lucene.We have merged these separated projects again,LXDBequals
> > spark sql+hbase+lucene+parquet+hdfs,it is a super database.It took me 10
> > years to complete these merging operations.But the purpose is no longer a
> > search engine, but a database.
> >
> >
> > Best regards
> >  yannian mu
> >
> >
> > LXDB Proposal
> > == Abstract ==
> > LXDB is a high-performance,OLAP,full text search database.
> >
> >
> > === it`s base on hbase,but replaced hfile with lucene index to support
> > more effective secondary indexes.===
> > we modify hbase region server ,we change hfile to lucene,

Re: Re: [Proposal] lxdb - proposal for Apache Incubation

2021-02-28 Thread f...@lucene.cn
When CarbonData appeared in 2015, it shocked me. Adding a layer of indexes to
big data is what I have been working on all these years, and I did not expect
another team in the world to have the same idea: both systems are built on
Hadoop, and both even start up through Spark on YARN.

Both are based on Spark, and at the core both change Spark's underlying data
structures, using a purpose-built data format such as an index to speed Spark
up. Whether the data has an index, and whether that index is stored on local
disk or on HDFS, is a significant feature that distinguishes us from other
analytical systems such as Hive, Spark SQL, Impala, and some MPP databases. On
this point we are consistent with CarbonData.

Our team later spent some effort on a test against CarbonData, and the
positioning still differs considerably in some directions.

As for ClickHouse, I had rarely come across it in my projects. Then one day,
while I was recruiting in a chat group, someone asked me whether our product is
as fast as ClickHouse. That is how I learned that such a good product exists in
the industry.

#1 Coarse-grained index vs. fine-grained index, or index stored by block vs.
index not stored by block.
In our tests, the write speed of CarbonData and ClickHouse was very fast. We
tested LXDB and Elasticsearch at the same time; both are based on Lucene, and
both were an order of magnitude slower than the former two.

#2 Later we found that the main difference lies in how the index is organized:
one is a per-block index, the other is an overall global index. The per-block
index is very fast at ingest and makes it easier to separate index and
computation; CarbonData is even a cloud-native database in the real sense
(ClickHouse stores its data locally, so it is not cloud native). The global
index has to merge segments continuously, which costs ingest performance. The
benefit, however, is not only faster single-column filtering but also faster
filtering on combinations of multiple conditions, and more convenient updates.
With a per-block index, a query that is not handled properly easily turns into
a full scan, and implementing updates is costly. A global index can be combined
with a BitSet or Bloom filter to evaluate combinations of conditions across
multiple columns, and it is better suited to updates, which is why both LXDB
and ES support real-time updates. This is also where we differ from CarbonData:
by comparison we additionally build on HBase, mainly to get real-time updates
at the KV level. In the future, if LXDB wants to take a step down the
cloud-native road, it will have to make some innovations and changes to the
Lucene index format.

#3 Because LXDB is bound to HBase, OLTP at the KV level is also a future
direction.

#4 For statistical analysis, the DocValues used by Lucene involve a lot of
random reads and do not perform as well as CarbonData and ClickHouse. For this
reason I spent some effort improving random-read performance on HDFS, and the
speed can be increased by 100 to 200 times. But this requires modifying the
HDFS code, which would hurt our product's compatibility with customer platforms
and force customers to replace Hadoop with our version, so in the end I did not
choose this approach. The improved project is at
https://github.com/lucene-cn/lxhadoop
One idea I came up with later is to apply the Parquet format to Lucene's
inverted and forward indexes. That way I can still do multi-condition full-text
retrieval, while Parquet's multi-column layout lets me avoid the random-read
performance problem by efficiently traversing the inverted lists.
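
To make the multi-column filtering in #2 concrete, here is a minimal sketch in Python. It is an illustration only, not LXDB or Lucene code: the `GlobalIndex` class, its method names, and the sample columns are hypothetical, and a plain Python integer stands in for a BitSet. It shows a global inverted index that keeps one bitset of row ids per (column, value) pair, so a combination of conditions on several columns becomes a bitwise AND instead of a scan.

```python
from collections import defaultdict


class GlobalIndex:
    """Toy global inverted index: (column, value) -> bitset of row ids.

    Hypothetical illustration; a Python int is used as the bitset.
    """

    def __init__(self):
        self.postings = defaultdict(int)  # (column, value) -> int bitset
        self.rows = []

    def add(self, row):
        """Store a row (a dict) and set its bit in every matching posting."""
        row_id = len(self.rows)
        self.rows.append(row)
        for column, value in row.items():
            self.postings[(column, value)] |= 1 << row_id
        return row_id

    def match_all(self, **conditions):
        """Return ids of rows satisfying every column=value condition."""
        bits = (1 << len(self.rows)) - 1  # start with all rows selected
        for column, value in conditions.items():
            # AND in the bitset for this condition; unknown values match nothing.
            bits &= self.postings.get((column, value), 0)
        return [i for i in range(len(self.rows)) if (bits >> i) & 1]


index = GlobalIndex()
index.add({"city": "beijing", "level": "error"})
index.add({"city": "beijing", "level": "info"})
index.add({"city": "shanghai", "level": "error"})
matches = index.match_all(city="beijing", level="error")
```

In this toy setup the two-condition query reads only two bitsets and never touches the stored rows, which is the property the global index is meant to provide. A per-block index would instead decide block by block whether to scan.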

f...@lucene.cn  yannian mu

From: Liang Chen
Date: 2021-02-28 18:02
To: general
Subject: Re: [Proposal] lxdb - proposal for Apache Incubation

Hi

It would be better if you could find an experienced IPMC member to help you
for preparing the proposal.
Based on Sheng W

Re: [Proposal] lxdb - proposal for Apache Incubation

2021-02-28 Thread Furkan KAMACI
Hi,

Actually, you have detailed documentation that explains your approach
compared to similar systems, along with the resulting performance metrics,
e.g. reducing storage by 10 to 100 times or achieving low-latency
queries.

My advice is as follows (some points are the same as Sheng's and Liang's):

1) Find an experienced mentor to guide you.

2) Start translating your documentation into English.

3) Open source your project. How can we comment on your project if
we cannot see anything about it?

4) Gain contributors to your project. At the very least you should show your
intention to have committers/contributors from outside your company. Eliminate
the risk of non-meritocratic management of the project.

5) Structure your proposal. Explain why people need this project, which
problems current projects have, and how you have handled them. We
should understand whether it is a bundle of other projects, a completely new
project, or a wrapper around other projects that eliminates their
shortcomings.

6) Find a suitable name for your project, so that you do not have to solve
trademark problems that may cost you time if you enter incubation.

Kind Regards,
Furkan KAMACI


On Sun, Feb 28, 2021 at 1:02 PM Liang Chen  wrote:

> Hi
>
> It would be better if you could find an experienced IPMC member to help you
> for preparing the proposal.
> Based on Sheng Wu input, i have one more comment : can you please explain
> what are the different with other similar data analysis DB?  you can
> consider explaining from use cases perspective.
>
> Regards
> Liang
>
>
> fp wrote
> > Dear Apache Incubator Community,
> >
> >
> > Please accept the following proposal for presentation and discussion:
> > https://github.com/lucene-cn/lxdb/wiki
> >
> >
> > LXDB is a high-performance,OLAP,full text search database.it`s base on
> > hbase,but replaced hfile with lucene index to support more effective
> > secondary indexes,it`s also base on spark sql,so that you can used sql api
> > to visit data and do olap calculate. and also the lucene index is store on
> > hdfs (not local disk).
> >
> >
> > In our Production System, LXDB supported 200+ clusters,some of the single
> > cluster is 1000+ nodes,insert 200 billion rows per day ( 2
> > billion rows for total), one of the biggest single table has 200million
> > lucene index on LXDB.
> >
> >
> > Hadoop`s father Doug Cutting cut nutch into HBase, MapReduce (hive), HDFS,
> > Lucene.We have merged these separated projects again,LXDBequals
> > spark sql+hbase+lucene+parquet+hdfs,it is a super database.It took me 10
> > years to complete these merging operations.But the purpose is no longer a
> > search engine, but a database.
> >
> >
> >
> >
> >
> > Best regards
> >  yannian mu
> >
> >
> >
> >
> > LXDB Proposal
> > == Abstract ==
> > LXDB is a high-performance,OLAP,full text search database.
> >
> >
> > === it`s base on hbase,but replaced hfile with lucene index to support
> > more effective secondary indexes.===
> > we modify hbase region server ,we change hfile to lucene,when put
> > data we put document to lucene instande of put data to hfile
> > lucene index store on region server(it is not sote in
> > different cluster like elstice search+hbase ,it takes to copy of data)
> >
> >
> > === it`s base on spark sql for olap===
> > we Integrated spark and hbase together ,it`s useage like this ,
> > 1.unpackage lxdb.tar.gz
> > 2.config hadoop_config path,
> > 3.run start-all.sh to start cluster.
> > lxdb can startup spark through hadoop yarn ,and then spark executor
> > process Embedded start hbase region server service .
> >
> >
> > you can operate lxdb database throuth spark sql api(hive) or mysql api.
> > 1.the sql used spark rdd+hbase scaner to visit hbase .
> > 2.the sql`s condition (filter or group by agg) will predicate to hbase ,
> > 3.hbase used lucene index to filter data in region server.
> > all of the spark,hbase,lucene is Embedded Integrated together,it is
> > not a seperate cluster ,that is the different with solr/es +
> > hbase+spark Solution.
> >
> >
> > == Background ==
> > === Multiple copies of data ===
> > Apache HBase+Elastic Search is the most popular Solution on full text
> > search ,but it`s weak on Online AnalyticalProcessing.
> > so most of the time the Production System used spark(or hive or impala or
> > presto) ,hbase,solr/es at the same time.Multiple copies of data are stored
> > in multiple systems,multiple systems has different Api .Data consistency
> > is difficult to guarantee.For the above reasons we merger
> > spark,hbase,elastic into one project .it`s target is used one copy of
> > data,one cluster,one api to solve olap,kv,full text...database scenarios.
> >
> >
> > === Merging and splitting of lucene indexes(hstore) acrocess different
> > machine on hdfs ===
> > As we all know solr/es store file in local fileSystem,it`s shard num must
> > be a fix num,but if we store index on hdfs,the index can split able like
> > hbase hstore,it can split or merge 

Re: [Proposal] lxdb - proposal for Apache Incubation

2021-02-28 Thread Liang Chen
Hi

It would be better if you could find an experienced IPMC member to help you
prepare the proposal.
Based on Sheng Wu's input, I have one more comment: can you please explain
how it differs from other similar data analysis DBs? You can
consider explaining from a use-case perspective.

Regards
Liang


fp wrote
> Dear Apache Incubator Community,
> 
> 
> Please accept the following proposal for presentation and discussion:
> https://github.com/lucene-cn/lxdb/wiki
> 
> 
> LXDB is a high-performance OLAP and full-text search database. It is based
> on HBase, but replaces HFile with a Lucene index to support more effective
> secondary indexes. It is also based on Spark SQL, so you can use the SQL API
> to access data and run OLAP computations. The Lucene index is stored on
> HDFS (not on local disk).
>
>
> In our production systems, LXDB supports 200+ clusters, some of which have
> 1000+ nodes in a single cluster, and inserts 200 billion rows per day ( 2
> billion rows for total). One of the biggest single tables has 200 million
> lucene index on LXDB.
>
>
> Hadoop's father, Doug Cutting, split Nutch into HBase, MapReduce (Hive),
> HDFS, and Lucene. We have merged these separate projects again: LXDB equals
> Spark SQL + HBase + Lucene + Parquet + HDFS; it is a super database. It took
> me 10 years to complete this merging. The purpose, however, is no longer a
> search engine but a database.
> 
> 
> 
> 
> 
> Best regards
>  yannian mu
> 
> 
> 
> 
> LXDB Proposal
> == Abstract ==
> LXDB is a high-performance OLAP and full-text search database.
>
>
> === It is based on HBase, but replaces HFile with a Lucene index to support
> more effective secondary indexes ===
> We modified the HBase region server and replaced HFile with Lucene: on a
> put, we write a document to Lucene instead of writing data to an HFile.
> The Lucene index is stored on the region server (it is not stored in a
> different cluster, as with Elasticsearch + HBase, which takes two copies of
> the data).
>
>
> === It is based on Spark SQL for OLAP ===
> We integrated Spark and HBase together. Usage is as follows:
> 1. Unpack lxdb.tar.gz.
> 2. Configure the hadoop_config path.
> 3. Run start-all.sh to start the cluster.
> LXDB starts Spark through Hadoop YARN, and the Spark executor process then
> starts the HBase region server service embedded within it.
>
>
> You can operate the LXDB database through the Spark SQL API (Hive) or the
> MySQL API.
> 1. The SQL uses Spark RDD + the HBase scanner to access HBase.
> 2. The SQL conditions (filter or group-by aggregation) are pushed down to
> HBase.
> 3. HBase uses the Lucene index to filter data in the region server.
> Spark, HBase, and Lucene are all embedded and integrated together; they are
> not separate clusters. That is the difference from a Solr/ES + HBase + Spark
> solution.
> 
> 
> == Background ==
> === Multiple copies of data ===
> Apache HBase + Elasticsearch is the most popular solution for full-text
> search, but it is weak at online analytical processing.
> So most of the time a production system uses Spark (or Hive, Impala, or
> Presto), HBase, and Solr/ES at the same time. Multiple copies of the data are
> stored in multiple systems, and the systems have different APIs, so data
> consistency is difficult to guarantee. For these reasons we merged Spark,
> HBase, and Elastic into one project. Its goal is to use one copy of the
> data, one cluster, and one API to cover OLAP, KV, full-text, and other
> database scenarios.
>
>
> === Merging and splitting of Lucene indexes (HStore) across different
> machines on HDFS ===
> As we all know, Solr/ES store files in the local file system, so the number
> of shards must be fixed. But if we store the index on HDFS, the index becomes
> splittable like an HBase HStore: it can split or merge across machine nodes.
> This is very useful for a distributed database, because it determines how
> many resources are allocated to a table. Most of the time the number of
> records in a table changes over time, so the number of shards always needs
> adjusting. If the index is stored locally it cannot be split across
> different machines, but a Lucene index stored on HDFS can.
> Whether the number of shards can be flexibly adjusted, that is, whether the
> system can scale elastically, is particularly important in a distributed
> database.
> 
> 
> 
> === Solving the insufficiency of secondary indexes ===
> Some people use HBase secondary indexes, as in the Phoenix project. But
> those schemes, based on the HBase rowkey, carry a lot of redundancy: you
> cannot create too many indexes, and the data inflation rate is too high. So
> using a Lucene index instead of secondary indexes is the best choice.
>
>
> === We add a Lucene index for Spark OLAP ===
> Most OLAP systems, such as Hive, Spark SQL, Impala, or some MPP databases,
> suffer from brute-force scanning and poor data timeliness.
> 1. They use brute-force scans to compute over the data. Another choice is to
> add an index to the big data; in some cases an index can greatly improve on
> the performance of the original brute-force scan. I think that, just as in
> a traditional database, indexing technology can greatly improve the
> performance of the database.
> 2.Another problem of thoses database or system, 

Re: [Proposal] lxdb - proposal for Apache Incubation

2021-02-27 Thread Sheng Wu
Base, spark, zookeeper, and does not rely on any code from my previous
> company
> 4:And as you repeated said the original projects, is this project created
> 100% on your own, is it including something from Alibaba/Tencent?
> >the current version of lxdb is 100% created on my own . it isn`t
> including anything form Alibaba/Tencent.
> >The previous version of lxdb relies on the mdrill of Alibaba. I am the
> author of mdrill project and mdrill is an open source project.
> >About Tencent Hermes is my work in Tencent, but after I started my
> business, I didn't use the source code of Hermes, and I informed Tencent
> before I started my business
> 5:As there is no open-source, I can't verify.
> >If you are interested, I can provide the source code to PMC members
> separately for auditing
> 6:Due to this is close-source, we also need you to be clear about whether
> you
> are going to submit SGA and open source to the public.
> >I haven't open source the project yet, mainly to see if PMC is interested
> in my project. If interested, I will open source. In this way, I can
> persuade my investors. If PMC is not interested, I may consider opening
> source later. At present, the project has about 10 lines of code, which
> can be provided to PMC for review
> 7:The most important, `lucene` is an Apache trademark and Apache
> project,this makes me have concerns about the branding violation.
> >I just like Lucene. If the name offends PMC, I can correct it for the
> right name.
> 8:At last, typically, we(incubator) expect you to have open-sourced the
> project, and at least have a small community and first adoption out of your
> company.
> Our company is a commercial company. The community of previous projects
> here may be different from what you said. We have organized a QQ
> communication group with about 1000 people. Many students here have been
> our users for many years, and they are looking forward to the development
> of our project
> 9:To join the incubator, you also need at least 3 IPMC members and 1
> Champion(Apache member or officer) to help you understand the incubator.
> Can you help me? I really have language problems. There is less
> communication in this area. I have done a lot of sharing in China before. I
> hope you can help me if you can.If you like this project, you can also join
> us. It's a very good opportunity in China's database market
> my telnum is 17099831107
>
>
> yannian mu 母延年
> luxin,muyannian
>
>
> -- Original Message --
> *From:* "general" ;
> *Sent:* Saturday, February 27, 2021, 9:06 PM
> *To:* "Incubator";
> *Subject:* Re: [Proposal] lxdb - proposal for Apache Incubation
>
> Hi
>
> Since you are proposing a new project to a global foundation, you should at
> least keep your documentation in English. Your provided links are Chinese,
> which for most IPMC people, it is not readable.
> And since this project is close-source, please provide the dependencies.
> And as you repeated said the original projects, is this project created
> 100% on your own, is it including something from Alibaba/Tencent? As there
> is no open-source, I can't verify.
> Due to this is close-source, we also need you to be clear about whether you
> are going to submit SGA and open source to the public.
>
> The most important, `lucene` is an Apache trademark and Apache project,
> this makes me have concerns about the branding violation.
>
> At last, typically, we(incubator) expect you to have open-sourced the
> project, and at least have a small community and first adoption out of your
> company.
>
> To join the incubator, you also need at least 3 IPMC members and 1
> Champion(Apache member or officer) to help you understand the incubator.
>
> Sheng Wu 吴晟
> Twitter, wusheng1108
>
>
> On Sat, Feb 27, 2021 at 6:40 PM, fp wrote:
>
> > Dear Apache Incubator Community,
> >
> >
> > Please accept the following proposal for presentation and discussion:
> > https://github.com/lucene-cn/lxdb/wiki
> >
> >
> > LXDB is a high-performance,OLAP,full text search database.it`s base on
> > hbase,but replaced hfile with lucene index to support more effective
> > secondary indexes,it`s also base on spark sql,so that you can used sql
> api
> > to visit data and do olap calculate. and also the lucene index is store
> on
> > hdfs (not local disk).
> >
> >
> > In our Production System, LXDB supported 200+ clusters,some of the single
> > cluster is 1000+ nodes,insert 200 billion rows per day ( 2
> > billion rows for total), one of the biggest single table has 200million
> > lucene index on LXDB.
> >
> >
> > Hadoop`s father Doug Cutting cut

Re: [Proposal] lxdb - proposal for Apache Incubation

2021-02-27 Thread Sheng Wu
Could you change your mail tool?

Your reply looks like this...
1.Since you are proposing a new project to a global foundation, you should
at
least keep your documentation in English.
Of course, if Apache accepts this project, I will complete all the
documents and translate them into English. Although my English is not very
good, many of our company come back from Australia. This should not be a
problem
2:Your provided links are Chinese,which for most IPMC people, it is not
readable.
In addition to the source code, what other documents are needed? Do you
want me to provide some basic project use or introduction first?
3:And since this project is close-source, please provide the dependencies.

Sheng Wu 吴晟
Twitter, wusheng1108


On Sat, Feb 27, 2021 at 9:56 PM, fp wrote:

> Hi 吴晟,
> Thank you for your reply. In response to your questions, my answers are as
> follows. (My English is not very good; please bear with me.)
>
> > 1. Since you are proposing a new project to a global foundation, you
> > should at least keep your documentation in English.
> Of course. If Apache accepts this project, I will complete all the
> documents and translate them into English. Although my English is not very
> good, many people at our company have returned from Australia, so this
> should not be a problem.
>
> > 2. Your provided links are Chinese, which for most IPMC people, it is not
> > readable.
> In addition to the source code, what other documents are needed? Do
> you want me to provide some basic project usage or an introduction first?
>
> > 3. And since this project is close-source, please provide the
> > dependencies.
> The version to be open-sourced is 100% rewritten. It relies on Hadoop,
> HBase, Spark, and ZooKeeper, and does not rely on any code from my previous
> company.
>
> > 4. And as you repeated said the original projects, is this project created
> > 100% on your own, is it including something from Alibaba/Tencent?
> The current version of LXDB was created 100% on my own. It does not
> include anything from Alibaba/Tencent.
> The previous version of LXDB relied on Alibaba's mdrill. I am the
> author of the mdrill project, and mdrill is an open source project.
> Tencent Hermes was my work at Tencent, but after I started my
> business I did not use the Hermes source code, and I informed Tencent
> before I started my business.
>
> > 5. As there is no open-source, I can't verify.
> If you are interested, I can provide the source code to PMC members
> separately for auditing.
>
> > 6. Due to this is close-source, we also need you to be clear about whether
> > you are going to submit SGA and open source to the public.
> I haven't open-sourced the project yet, mainly because I want to see whether
> the PMC is interested in my project. If it is, I will open-source it; that
> way I can persuade my investors. If the PMC is not interested, I may consider
> open-sourcing later. At present, the project has about 10 lines of code,
> which can be provided to the PMC for review.
>
> > 7. The most important, `lucene` is an Apache trademark and Apache
> > project, this makes me have concerns about the branding violation.
> I just like Lucene. If the name offends the PMC, I can change it to a
> suitable name.
>
> > 8. At last, typically, we(incubator) expect you to have open-sourced the
> > project, and at least have a small community and first adoption out of
> > your company.
> Our company is a commercial company. The community around our previous
> projects may be different from what you describe. We have organized a QQ
> communication group with about 1000 people. Many students there have been
> our users for many years, and they are looking forward to the development
> of our project.
>
> > 9. To join the incubator, you also need at least 3 IPMC members and 1
> > Champion(Apache member or officer) to help you understand the incubator.
> Can you help me? I really have language problems, and there has been little
> communication in this area. I have given a lot of talks in China before. I
> hope you can help me if you can. If you like this project, you can also join
> us. It's a very good opportunity in China's database market.
> My phone number is 17099831107
>
>
>
>
>
>
> yannian mu 母延年
> luxin,muyannian
>
>
>
>
> -- Original Message --
> From: "general" <wu.sheng.841...@gmail.com>;
> Sent: Saturday, February 27, 2021, 9:06 PM
> To: "Incubator"
> Subject: Re: [Proposal] lxdb - proposal for Apache Incubation
>
>
>
> Hi
>
> Since you are proposing a new project to a global foundation, you should at
> least keep your documentation in English. Your provided links are Chinese,
> which for most IPMC people, it is not readable.
> And since this project is close-source, please provide the dependencies.
>

Re: [Proposal] lxdb - proposal for Apache Incubation

2021-02-27 Thread Sheng Wu
Hi

Since you are proposing a new project to a global foundation, you should at
least keep your documentation in English. The links you provided are in
Chinese, which most IPMC people cannot read.
And since this project is closed-source, please provide the dependencies.
And as you repeatedly mention the original projects: was this project created
100% on your own, or does it include something from Alibaba/Tencent? As it
is not open source, I can't verify.
Because this is closed-source, we also need you to be clear about whether you
are going to submit an SGA and open the source to the public.

Most importantly, `lucene` is an Apache trademark and Apache project,
which makes me concerned about a branding violation.

Finally, we (the Incubator) typically expect you to have open-sourced the
project, and to have at least a small community and first adoption outside
your company.

To join the Incubator, you also need at least 3 IPMC members and 1
Champion (Apache member or officer) to help you understand the Incubator.

Sheng Wu 吴晟
Twitter, wusheng1108


On Sat, Feb 27, 2021 at 6:40 PM, fp wrote:

> Dear Apache Incubator Community,
>
>
> Please accept the following proposal for presentation and discussion:
> https://github.com/lucene-cn/lxdb/wiki
>
>
> LXDB is a high-performance OLAP and full-text search database. It is based
> on HBase, but replaces HFile with a Lucene index to support more effective
> secondary indexes. It is also based on Spark SQL, so you can use the SQL API
> to access data and run OLAP computations. The Lucene index is stored on
> HDFS (not on local disk).
>
>
> In our production systems, LXDB supports 200+ clusters, some of which have
> 1000+ nodes in a single cluster, and inserts 200 billion rows per day ( 2
> billion rows for total). One of the biggest single tables has 200 million
> lucene index on LXDB.
>
>
> Hadoop's father, Doug Cutting, split Nutch into HBase, MapReduce (Hive),
> HDFS, and Lucene. We have merged these separate projects again: LXDB equals
> Spark SQL + HBase + Lucene + Parquet + HDFS; it is a super database. It took
> me 10 years to complete this merging. The purpose, however, is no longer a
> search engine but a database.
>
>
>
>
> Best regards
>  yannian mu
>
>
>
>
> LXDB Proposal
> == Abstract ==
> LXDB is a high-performance,OLAP,full text search database.
>
>
> === it`s base on hbase,but replaced hfile with lucene index to support
> more effective secondary indexes.===
> we modify hbase region server ,we change hfile to lucene,when put
> data we put document to lucene instande of put data to hfile
> lucene index store on region server (it is not sote in different
> cluster like elstice search+hbase ,it takes to copy of data)
>
>
> === It is based on Spark SQL for OLAP ===
> We integrated Spark and HBase together. Usage is as follows:
> 1. Unpack lxdb.tar.gz.
> 2. Configure the hadoop_config path.
> 3. Run start-all.sh to start the cluster.
> LXDB starts Spark through Hadoop YARN, and the Spark executor process then
> starts an embedded HBase region server service.
>
>
> You can operate the LXDB database through the Spark SQL API (Hive) or the
> MySQL API.
> 1. The SQL uses Spark RDDs plus the HBase scanner to access HBase.
> 2. The SQL's conditions (filters or group-by aggregations) are pushed down
>    to HBase.
> 3. HBase uses the Lucene index to filter data in the region server.
> Spark, HBase, and Lucene are all embedded and integrated together rather
> than run as separate clusters; that is the difference from a Solr/ES +
> HBase + Spark solution.
>
>
> == Background ==
> === Multiple copies of data ===
> Apache HBase + Elasticsearch is the most popular solution for full-text
> search, but it is weak at online analytical processing (OLAP).
> So most of the time a production system uses Spark (or Hive, Impala, or
> Presto), HBase, and Solr/ES at the same time. Multiple copies of the data are
> stored in multiple systems, the systems have different APIs, and data
> consistency is difficult to guarantee. For these reasons we merged Spark,
> HBase, and Elasticsearch into one project. Its goal is to use one copy of the
> data, one cluster, and one API to serve OLAP, KV, full-text, and other
> database scenarios.
>
>
> === Merging and splitting of Lucene indexes (HStore) across different
> machines on HDFS ===
> As we all know, Solr/ES store files in the local file system, so the number
> of shards must be fixed. If we store the index on HDFS, however, the index
> becomes splittable like an HBase HStore: it can split or merge across
> machine nodes. This is very useful for a distributed database, because it
> determines how much resource is allocated to a table. Most of the time the
> number of records in a table changes over time, so the number of shards
> always needs adjusting. If the index is stored locally it cannot be split
> across different machines, but a Lucene index stored on HDFS can.
> Whether the number of shards can be flexibly adjusted, and whether the system
> has elastic scaling ability, is particularly important in a distributed
> database.
>
>
> === Solving the insufficiency of secondary indexes ===
> Some people use HBase secondary indexes, as in the Phoenix project, but those
> schemes, being based on the HBase rowkey, have a lot 

Re: [Proposal] Pegasus - proposal for Apache Incubation

2020-06-10 Thread Tao Wu
Welcome, Liang! Thank you for offering to help. Now we have enough mentors.

On Wed, Jun 10, 2020 at 10:46 PM Liang Chen wrote:

> Hi Tao Wu
>
> I am willing to be a mentor. I am familiar with big data and can give some
> help with incubation.
>
>
> Regards
> Liang
>
> Justin Mclean wrote
> > Hi,
> >
> >> Thank you for your interest, Kevin. Since now we have 2 mentors, shall we
> >> call for a vote?
> >
> > Given Kevin said he is only 1/2 a mentor, it would be preferable if you had
> > one more mentor. Anyone willing to help out?
> >
> > Thanks,
> > Justin
> > -
> > To unsubscribe, e-mail: general-unsubscribe@.apache
> > For additional commands, e-mail: general-help@.apache
>
>
>
>
>
> --
> Sent from: http://apache-incubator-general.996316.n3.nabble.com/
>
> -
> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> For additional commands, e-mail: general-h...@incubator.apache.org
>
>


Re: [Proposal] Pegasus - proposal for Apache Incubation

2020-06-10 Thread Liang Chen
Hi Tao Wu

I am willing to be a mentor. I am familiar with big data and can give some
help with incubation.


Regards
Liang

Justin Mclean wrote
> Hi,
> 
>> Thank you for your interest, Kevin. Since now we have 2 mentors, shall we
>> call for a vote?
> 
> Given Kevin said he is only 1/2 a mentor, it would be preferable if you had
> one more mentor. Anyone willing to help out?
> 
> Thanks,
> Justin





--
Sent from: http://apache-incubator-general.996316.n3.nabble.com/




Re: [Proposal] Pegasus - proposal for Apache Incubation

2020-06-09 Thread Justin Mclean
Hi,

> Thank you for your interest, Kevin. Since now we have 2 mentors, shall we
> call for a vote?

Given Kevin said he is only 1/2 a mentor, it would be preferable if you had one
more mentor. Anyone willing to help out?

Thanks,
Justin



Re: [Proposal] Pegasus - proposal for Apache Incubation

2020-06-09 Thread Tao Wu
Thank you for your interest, Kevin. Since now we have 2 mentors, shall we
call for a vote?
@Gosling Von 

On Wed, Jun 10, 2020 at 2:41 AM Kevin A. McGrail wrote:

> While I am honored to be asked, I am actively mentoring a number of
> projects right now, with another one being proposed.
> Please count me as a mentor, but seek out others and consider me in reality
> to be 1/2 a mentor :-)
>
> --
> Kevin A. McGrail
> Member, Apache Software Foundation
> Chair Emeritus Apache SpamAssassin Project
> https://www.linkedin.com/in/kmcgrail - 703.798.0171
>
>
> On Mon, Jun 8, 2020 at 4:13 AM 吴涛  wrote:
>
> > Thanks, Kevin. I'd very much appreciate it if you can become one of our
> > mentors.
> >
> > Regards
> >
> > Tao.


Re: [Proposal] Pegasus - proposal for Apache Incubation

2020-06-09 Thread Kevin A. McGrail
While I am honored to be asked, I am actively mentoring a number of
projects right now, with another one being proposed.
Please count me as a mentor, but seek out others and consider me in reality
to be 1/2 a mentor :-)

--
Kevin A. McGrail
Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171


On Mon, Jun 8, 2020 at 4:13 AM 吴涛  wrote:

> Thanks, Kevin. I'd very much appreciate it if you can become one of our
> mentors.
>
> Regards
>
> Tao.


Re: [Proposal] Pegasus - proposal for Apache Incubation

2020-06-08 Thread Justin Mclean
Hi,

The ASF allows 3rd party code to be included in releases. It wouldn’t be part
of the grant and would still retain the original headers and copyright. The ASF
also doesn’t fork other communities’ code, but given this is unmaintained and no
longer in active development [1], from what I can see I don’t see any issues here.

Thanks,
Justin

1. https://github.com/microsoft/rDSN



Re: [Proposal] Pegasus - proposal for Apache Incubation

2020-06-08 Thread Tao Wu
Sure, Sheng. We certainly don't want to steal any IP from anyone. Sorry if
I didn't make it clear.

My question is: is it legally sufficient to keep Microsoft's copyright notice
on every file that originates from them, as we have always done? As far as I
know, that suffices to prevent legal problems.

But I'm not sure whether the ASF has any restriction requiring that all code
except third-party code be owned by the ASF.


Re: [Proposal] Pegasus - proposal for Apache Incubation

2020-06-08 Thread Duo Zhang
Since MIT is compatible with Apache 2.0, I do not think this is a blocking
issue. We can even bundle the rdsn code into the main Pegasus repository, I
think.
We can solve this problem during incubation.
Of course, it would be good if Microsoft were willing to donate the code.

On Mon, Jun 8, 2020 at 6:00 PM Sheng Wu wrote:

> On Mon, Jun 8, 2020 at 5:22 PM 吴涛 wrote:
>
> > Hi, Willem,
> >
> > This is a historical issue that we plan to resolve soon.
> >
> > We've reduced the incompatible modifications to facebook/rocksdb [1],
> > from which pegasus-rocksdb was forked, and I believe we will soon no
> > longer need to maintain this repo. It will be used purely as an external
> > dependency.
> >
> > As for rdsn, since the original repo (fully MIT-licensed)
> > microsoft/rDSN [2] has been unmaintained for a long time, we plan to
> > merge the two repos together. We've put a lot of effort into improving
> > and refactoring rdsn. I'm not sure whether we should ask Microsoft for
> > a CLA for the donation of our code.
> >
>
> I am afraid this needs to be donated; there is no way around that. Being
> unmaintained doesn't change the fact that its IP belongs to the original team.
>
> Sheng Wu 吴晟
> Twitter, wusheng1108
>
>
> >
> > [1] https://github.com/facebook/rocksdb
> > [2] https://github.com/XiaoMi/rdsn
> >
> > On Mon, Jun 8, 2020 at 4:12 PM Willem Jiang wrote:
> >
> > > Hi,
> > >
> > > I just went through the proposal and found that there are two source
> > > code repos [1][2] that are forks.
> > >
> > > Are you planning to donate these two repos to Apache as part of the
> > > Pegasus project?
> > > That complicates the donation of Pegasus, as we need to address the
> > > ownership of these two repos first.
> > > Normally we just contribute patches upstream, as maintaining a forked
> > > repo consumes a lot of resources.
> > >
> > > [1] https://github.com/XiaoMi/rdsn
> > > [2] https://github.com/XiaoMi/pegasus-rocksdb
> > >
> > >
> > > Willem Jiang
> > >
> > > Twitter: willemjiang
> > > Weibo: 姜宁willem
> > >

Re: [Proposal] Pegasus - proposal for Apache Incubation

2020-06-08 Thread Sheng Wu
On Mon, Jun 8, 2020 at 5:22 PM 吴涛 wrote:

> Hi, Willem,
>
> This is a historical issue that we plan to resolve soon.
>
> We've reduced the incompatible modifications to facebook/rocksdb [1], from
> which pegasus-rocksdb was forked, and I believe we will soon no longer need
> to maintain this repo. It will be used purely as an external dependency.
>
> As for rdsn, since the original repo (fully MIT-licensed)
> microsoft/rDSN [2] has been unmaintained for a long time, we plan to merge
> the two repos together. We've put a lot of effort into improving and
> refactoring rdsn. I'm not sure whether we should ask Microsoft for a CLA
> for the donation of our code.
>

I am afraid this needs to be donated; there is no way around that. Being
unmaintained doesn't change the fact that its IP belongs to the original team.

Sheng Wu 吴晟
Twitter, wusheng1108


>
> [1] https://github.com/facebook/rocksdb
> [2] https://github.com/XiaoMi/rdsn
>
> On Mon, Jun 8, 2020 at 4:12 PM Willem Jiang wrote:
>
> > Hi,
> >
> > I just went through the proposal and found that there are two source code
> > repos [1][2] that are forks.
> >
> > Are you planning to donate these two repos to Apache as part of the
> > Pegasus project?
> > That complicates the donation of Pegasus, as we need to address the
> > ownership of these two repos first.
> > Normally we just contribute patches upstream, as maintaining a forked
> > repo consumes a lot of resources.
> >
> > [1] https://github.com/XiaoMi/rdsn
> > [2] https://github.com/XiaoMi/pegasus-rocksdb
> >
> >
> > Willem Jiang
> >
> > Twitter: willemjiang
> > Weibo: 姜宁willem
> >

Re: [Proposal] Pegasus - proposal for Apache Incubation

2020-06-08 Thread 吴涛
Hi, Willem,

This is a historical issue that we plan to resolve soon.

We've reduced the incompatible modifications to facebook/rocksdb [1], from
which pegasus-rocksdb was forked, and I believe we will soon no longer need
to maintain this repo. It will be used purely as an external dependency.

As for rdsn, since the original repo (fully MIT-licensed)
microsoft/rDSN [2] has been unmaintained for a long time, we plan to merge
the two repos together. We've put a lot of effort into improving and
refactoring rdsn. I'm not sure whether we should ask Microsoft for a CLA
for the donation of our code.

[1] https://github.com/facebook/rocksdb
[2] https://github.com/XiaoMi/rdsn

On Mon, Jun 8, 2020 at 4:12 PM Willem Jiang wrote:

> Hi,
>
> I just went through the proposal and found that there are two source code
> repos [1][2] that are forks.
>
> Are you planning to donate these two repos to Apache as part of the
> Pegasus project?
> That complicates the donation of Pegasus, as we need to address the
> ownership of these two repos first.
> Normally we just contribute patches upstream, as maintaining a forked
> repo consumes a lot of resources.
>
> [1] https://github.com/XiaoMi/rdsn
> [2] https://github.com/XiaoMi/pegasus-rocksdb
>
>
> Willem Jiang
>
> Twitter: willemjiang
> Weibo: 姜宁willem
>

Re: [Proposal] Pegasus - proposal for Apache Incubation

2020-06-08 Thread 吴涛
Thanks, Kevin. I'd very much appreciate it if you can become one of our mentors.

Regards

Tao.




Re: [Proposal] Pegasus - proposal for Apache Incubation

2020-06-08 Thread Willem Jiang
Hi,

I just went through the proposal and found that there are two source code
repos [1][2] that are forks.

Are you planning to donate these two repos to Apache as part of the
Pegasus project?
That complicates the donation of Pegasus, as we need to address the
ownership of these two repos first.
Normally we just contribute patches upstream, as maintaining a forked
repo consumes a lot of resources.

[1] https://github.com/XiaoMi/rdsn
[2] https://github.com/XiaoMi/pegasus-rocksdb


Willem Jiang

Twitter: willemjiang
Weibo: 姜宁willem


Re: [Proposal] Pegasus - proposal for Apache Incubation

2020-06-06 Thread Kevin A. McGrail
This looks very interesting.  I've used Redis for a long time and Pegasus looks
very interesting.

I'd like to see a champion and some mentors but otherwise I really like
what I see here.

Regards,
KAM
--
Kevin A. McGrail
Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171


On Tue, Jun 2, 2020 at 3:49 AM 吴涛  wrote:

> Dear Apache Incubator Community,
>
> I'd like to open up a discussion about incubating Pegasus at Apache. Our
> proposal can be found at https://pegasus-kv.github.io/community/proposal
> and is also included below.
>
> We are looking for a possible Champion, if anyone would like to volunteer.
> Thanks a lot!
>
> Best regards
>   Tao Wu
>
> Pegasus Proposal
>
> == Abstract ==
>
> Pegasus is a distributed key-value storage system that is designed to be
> horizontally scalable, strongly consistent and high-performance.
>
> - Pegasus codebase: https://github.com/XiaoMi/pegasus
> - Website: https://pegasus-kv.github.io
>
> == Proposal ==
>
> Pegasus is a key-value database that delivers low-latency data access
> together with horizontal scalability, using hash-based partitioning.
> Pegasus uses PacificA protocol for strong consistency and RocksDB as the
> underlying storage engine.
>
> We propose to contribute the Pegasus codebase and associated artifacts
> (e.g., documentation, website content, etc.) to the Apache Software
> Foundation, and aim to build an open community around Pegasus’s continued
> development in the ‘Apache Way’.
>
> == Background ==
>
> Apache HBase was regarded as practically the only large-scale KV store
> solution at XiaoMi Corp until Pegasus came out in 2015. The original
> purpose of Pegasus was to solve the problems caused by HBase’s two-level
> architecture and implementation, including high latency due to Java GC
> and the RPC overhead of the underlying distributed filesystem, and long
> failover time due to the RegionServer being a single point of failure and
> the recovery overhead of splitting and replaying the HLog files.
>
> Pegasus aims to fill the gap between Redis and HBase. The former is
> in-memory and low-latency, but does not provide a strong-consistency
> guarantee. Unlike the latter, the Pegasus server is written entirely in C++
> and its read-write path relies only on the local filesystem.
>
> Apart from performance requirements, we also need a storage system to
> ensure multiple-level data safety and support fast data migration among
> data centers, automatic load balancing, and online partition splitting.
>
> After investigating lots of existing storage systems in the open source
> world, we could hardly find a suitable solution to satisfy all the
> requirements. So the journey of Pegasus began.
>
> === Rationale ===
>
> Pegasus is a mature and active project which has been widely adopted in
> XiaoMi. After the initial release of open source project in 2017, we have
> seen a great amount of interest across a diverse set of users and companies.
>
> Our experiences as committers and PMC members on other Apache projects
> have convinced us that a long-term home at the Apache Software Foundation
> would be a great fit for the project, ensuring that processes and procedures
> are in place to keep the project and community ‘healthy’ and free of any
> commercial, political or legal faults.
>
> === Initial Goal ===
>
> - Move the existing codebase, website, documentation, and mailing lists to
>   Apache-hosted infrastructure.
> - Work with the infrastructure team to implement and approve our code
>   review, build, and testing workflows in the context of the ASF.
> - Develop and release incrementally, in line with Apache guidelines.
>
> == Current Status ==
>
> Pegasus has been an open-source project on GitHub
> https://github.com/XiaoMi/pegasus since October 2017.
>
> === Meritocracy ===
>
> The intent of this proposal is to start building a diverse developer and
> user community around Pegasus following the ASF meritocracy model. We plan
> to invite more people as committers if they contribute to this project.
>
> === Releases ===
>
> Pegasus has undergone multiple public releases, listed here:
> https://github.com/XiaoMi/pegasus/releases.
>
> These old releases were not performed in the typical ASF fashion. We will
> adopt the ASF source release process upon joining the incubator.
>
> === Code Reviews ===
>
> Pegasus’s code reviews are currently public on GitHub
> https://github.com/XiaoMi/pegasus/pulls.
>
> === Community ===
>
> Pegasus seeks to develop developer and user communities during incubation.
>
> === Core Developers ===
>
> Currently most of the core developers of Pegasus work in the
> KV-Storage Team of Xiaomi. Yingchun Lai is one of the Apache Kudu PMC
> members. Zuoyan Qin is an experienced open-source developer who created
> sofa-pbrpc in his previous job at Baidu. Wei Huang is also an active
> contributor to Apache Doris (Incubating).
>
> - Zuoyan Qin (https://github.com/qinzuoyan)
> - Yuchen He 

Re: Re: [Proposal]New storage project: HBlock

2020-06-02 Thread zhangguochen
Hi Justin,

I am sorry for the delayed feedback, and thank you for your interest.
We have spent the time preparing more and working on HBlock. We would like
to enter the incubator but still need 2 more mentors to step forward.
If any volunteers are willing to mentor HBlock, we would then like a vote on
our proposal to start the journey to become Apache HBlock.

Best wishes.
Guochen.

-Original Message-
From:
general-return-73210-zhangguochen=chinatelecom...@incubator.apache.org
 On Behalf Of Justin Mclean
Sent: May 24, 2020 17:00
To: general@incubator.apache.org
Subject: Re: [Proposal]New storage project: HBlock

Hi,

Just checking in on progress here and where this proposal is at.

Thanks,
Justin

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org






Re: [Proposal]New storage project: HBlock

2020-05-24 Thread Justin Mclean
Hi,

Just checking in on progress here and where this proposal is at.

Thanks,
Justin




Re: [Proposal]New storage project: HBlock

2020-03-25 Thread Kevin A. McGrail
On 3/25/2020 5:39 PM, Ted Dunning wrote:
> On Wed, Mar 25, 2020 at 1:56 PM Kevin A. McGrail 
> wrote:
>
>> I have committed to champion and I think the points you make are good,
>> Ted.   Do you have the bandwidth to be a mentor?
>>
>  I don't have the time.
>
> I am interested in the project, but just can't afford the time and effort
> for mentoring a project like this that will need a lot of help and
> education. This would be even harder because of timezones. Most of my
> other-timezone meetings are with Europe (therefore early in my day). Adding
> Asia meetings and calls (necessary for education, I think) would mean
> burning both ends of the candle.

I understand but can't think of a better mentor for the project.  They
are looking for more mentors, btw, and I think HBlock presents a very
cool solution.  Anyone else willing to mentor them?

Regards,

KAM

-- 
Kevin A. McGrail
kmcgr...@apache.org

Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171





Re: [Proposal]New storage project: HBlock

2020-03-25 Thread Ted Dunning
On Wed, Mar 25, 2020 at 1:56 PM Kevin A. McGrail 
wrote:

> I have committed to champion and I think the points you make are good,
> Ted.   Do you have the bandwidth to be a mentor?
>

 I don't have the time.

I am interested in the project, but just can't afford the time and effort
for mentoring a project like this that will need a lot of help and
education. This would be even harder because of timezones. Most of my
other-timezone meetings are with Europe (therefore early in my day). Adding
Asia meetings and calls (necessary for education, I think) would mean
burning both ends of the candle.


Re: [Proposal]New storage project: HBlock

2020-03-25 Thread Kevin A. McGrail
I have committed to champion and I think the points you make are good,
Ted.   Do you have the bandwidth to be a mentor?

I will work with them to set expectations about the process.  I have also
asked for them to do some community building now, too.
--
Kevin A. McGrail
Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171


On Wed, Mar 25, 2020 at 12:00 PM Ted Dunning  wrote:

> Three things are very clear to me:
>
> 1) having an open source iSCSI implementation from a mature and experienced
> storage stream is a very cool thing, especially if it can be targeted to
> non HDFS storage relatively easily. Building such a thing requires very
> high levels of experience and expertise that have generally been lacking in
> the open source world.
>
> 2) this team is very naive about the negative impacts that Apache processes
> will have on their development speed and will need lots of mentoring. Given
> their release schedule, I think that there are symmetrical risks, first
> that the team will be tempted to JFDI when getting features out the door
> rather than communicate and share designs and second that if they build a
> proper community overcoming language, timezone and large internal team
> dynamics that the internal political costs will be severe due to slower
> development.
>
> 3) this team is very enthusiastic about making open source work and that
> might be enough to allow them to succeed in spite of the difficulties.
>
> The path to success here is, in my opinion, to require strong and engaged
> mentorship and make it very clear before they come in that Apache may not
> be a good fit due to the pressures they face to deliver on a schedule. If
> incubation with a high risk of exit back to a non-Apache form is acceptable
> to the project team, then it should be fine for Apache.
>
>
>
> On Mon, Mar 9, 2020 at 7:45 PM Sheng Wu  wrote:
>
> > Hi
> >
> > Personally, and basically, I am feeling the team has misunderstood
> > the meaning of incubator and the requirements of building the community.
> > Same as the last time discussion, I still think they will be in a big
> > pressure as they have to deal with the basic feature development,
> > community build and following ASF incubator requirements at the same time
> > if they are accepted into the incubator. And at the same time, the team
> > lacks the experiences of open source community in or out of ASF.
> > I am not sure whether this is good for the project. Seem like a little
> > hurry to join the incubator.
> > More Comments inline.
> >
> > Willing to listen to what other IPMCs think.
> >
> >  wrote on Tue, Mar 10, 2020 at 10:21 AM:
> >
> > > Hi, All,
> > >
> > > We are China Telecom Corporation Limited Cloud Computing Branch
> > > Corporation.
> > > We hope to contribute one of our projects named 'HBlock' to Apache.
> > > Here is the proposal of HBlock project, please feel free to let me know
> > > what
> > > the concerns and suggestions from you. Thank you so much.
> > >
> > > HBlock Proposal
> > >
> > > 1.Abstract
> > > The HBlock project will be an enterprise distributed block storage.
> > >
> > > 2.Proposal
> > > HBlock provides a distributed block storage with the following features:
> > > 2.1.User-space iSCSI target: HBlock will implement an iSCSI target that is
> > > RFC-7143 (https://tools.ietf.org/html/rfc7143) compliant written in pure
> > > Java designed to run on top of any mainstream Operating System, including
> > > Windows and Linux, as a user-space process.
> > > 2.2.Enterprise level features: HBlock will implement comprehensive
> > > enterprise level features, such as
> > > Asymmetric Logical Unit Access (ALUA, Information technology - SCSI Primary
> > > Commands - 4 (SPC-4), https://www.t10.org/cgi-bin/ac.pl?t=f=spc4r37.pdf),
> > >
> > > Persistent Reservations (PR, Information technology - SCSI Primary Commands
> > > - 4 (SPC-4), https://www.t10.org/cgi-bin/ac.pl?t=f=spc4r37.pdf),
> > > VMware vSphere Storage APIs - Array Integration (VAAI,
> > > https://www.vmware.com/techpapers/2012/vmware-vsphere-storage-apis-array-integration-10337.html),
> > > Offloaded Data Transfer (ODX,
> > > https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2012-R2-and-2012/hh831628(v=ws.11)),
> > > so that it will support
> > > session-level fail-over,
> > > Oracle Real Application Cluster(Oracle RAC,
> > > https://www.oracle.com/database/technologies/rac.html) ,
> > > Cluster File System (CFS), VMware cluster and Windows cluster.
> > > 2.3.Low latency: HBlock will implement in-memory distributed cache to
> > > reduce
> > > write latency and improve 

Re: [Proposal]New storage project: HBlock

2020-03-25 Thread Ted Dunning
Three things are very clear to me:

1) having an open source iSCSI implementation from a mature and experienced
storage stream is a very cool thing, especially if it can be targeted to
non HDFS storage relatively easily. Building such a thing requires very
high levels of experience and expertise that have generally been lacking in
the open source world.

2) this team is very naive about the negative impacts that Apache processes
will have on their development speed and will need lots of mentoring. Given
their release schedule, I think that there are symmetrical risks: first,
that the team will be tempted to JFDI when getting features out the door
rather than communicate and share designs, and second, that if they build a
proper community overcoming language, timezone and large internal team
dynamics, the internal political costs will be severe due to slower
development.

3) this team is very enthusiastic about making open source work and that
might be enough to allow them to succeed in spite of the difficulties.

The path to success here is, in my opinion, to require strong and engaged
mentorship and make it very clear before they come in that Apache may not
be a good fit due to the pressures they face to deliver on a schedule. If
incubation with a high risk of exit back to a non-Apache form is acceptable
to the project team, then it should be fine for Apache.



On Mon, Mar 9, 2020 at 7:45 PM Sheng Wu  wrote:

> Hi
>
> Personally, and basically, I am feeling the team has misunderstood
> the meaning of incubator and the requirements of building the community.
> Same as the last time discussion, I still think they will be in a big
> pressure as they have to deal with the basic feature development, community
> build and following ASF incubator requirements at the same time if they are
> accepted into the incubator. And at the same time, the team lacks the
> experiences of open source community in or out of ASF.
> I am not sure whether this is good for the project. Seem like a little
> hurry to join the incubator.
> More Comments inline.
>
> Willing to listen to what other IPMCs think.
>
>  wrote on Tue, Mar 10, 2020 at 10:21 AM:
>
> > Hi, All,
> >
> > We are China Telecom Corporation Limited Cloud Computing Branch
> > Corporation.
> > We hope to contribute one of our projects named 'HBlock' to Apache.
> > Here is the proposal of HBlock project, please feel free to let me know
> > what
> > the concerns and suggestions from you. Thank you so much.
> >
> > HBlock Proposal
> >
> > 1.Abstract
> > The HBlock project will be an enterprise distributed block storage.
> >
> > 2.Proposal
> > HBlock provides a distributed block storage with the following features:
> > 2.1.User-space iSCSI target: HBlock will implement an iSCSI target that is
> > RFC-7143 (https://tools.ietf.org/html/rfc7143) compliant written in pure
> > Java designed to run on top of any mainstream Operating System, including
> > Windows and Linux, as a user-space process.
> > 2.2.Enterprise level features: HBlock will implement comprehensive
> > enterprise level features, such as
> > Asymmetric Logical Unit Access (ALUA, Information technology - SCSI Primary
> > Commands - 4 (SPC-4), https://www.t10.org/cgi-bin/ac.pl?t=f=spc4r37.pdf),
> >
> > Persistent Reservations (PR, Information technology - SCSI Primary Commands
> > - 4 (SPC-4), https://www.t10.org/cgi-bin/ac.pl?t=f=spc4r37.pdf),
> > VMware vSphere Storage APIs - Array Integration (VAAI,
> > https://www.vmware.com/techpapers/2012/vmware-vsphere-storage-apis-array-integration-10337.html),
> > Offloaded Data Transfer (ODX,
> > https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2012-R2-and-2012/hh831628(v=ws.11)),
> > so that it will support
> > session-level fail-over,
> > Oracle Real Application Cluster(Oracle RAC,
> > https://www.oracle.com/database/technologies/rac.html) ,
> > Cluster File System (CFS), VMware cluster and Windows cluster.
> > 2.3.Low latency: HBlock will implement in-memory distributed cache to
> > reduce write latency and improve Input / Output Operations Per Second
> > (IOPS), and it will leverage storage-class memory to achieve even higher
> > durability without IOPS loss.
> > 2.4.Smart Compaction and Garbage Collection(GC): HBlock will convert all
> > the write operations into sequential append operations to improve the
> > random write performance, and it will choose the best timing to compact
> > and collect the garbage per Logical Unit (LU). Compared to Solid State
> > Drives' (SSD's) internal Garbage Collection, such a global GC will reduce
> > the need for the SSD's internal GC, which indirectly makes the SSD have
> > more usable space, and have even
> > better GC strategy due to 

Re: [Proposal]New storage project: HBlock

2020-03-25 Thread Rich Bowen




On 3/9/20 10:45 PM, Sheng Wu wrote:
> Hi
>
> Personally, and basically, I am feeling the team has misunderstood
> the meaning of incubator and the requirements of building the community.
> Same as the last time discussion, I still think they will be in a big
> pressure as they have to deal with the basic feature development, community
> build and following ASF incubator requirements at the same time if they are
> accepted into the incubator. And at the same time, the team lacks the
> experiences of open source community in or out of ASF.


I find this remark confusing. Surely this is what the incubator is *for* 
- to learn about open source community at the ASF.


A strong community and "basic feature development" are not requirements 
for entering the Incubator. Rather, the incubator is the place for 
community building (among other things).



> I noticed there are a lot of `will`s here in the Proposal section as the
> project core features.
> Are these language issues or all these features not available today?
> Which parts have been implemented?



There is no requirement that a project be a completed product when it 
comes to the ASF. Indeed, as our friend Stefano Mazzocchi observed, all 
those years ago, coming in with a completed product makes it a lot 
harder to build a strong community, because there's nothing for them to do.


I readily admit that I've been away from the Incubator for some time, 
but surely we don't require projects to have a robust feature set and 
vibrant community as a requirement of entry. That would seem completely 
contrary to the entire point of coming here.


--
Rich Bowen - rbo...@rcbowen.com
http://rcbowen.com/
@rbowen




Re: [Proposal]New storage project: HBlock

2020-03-09 Thread Sheng Wu
Hi

Personally, I feel the team has misunderstood
the meaning of the incubator and the requirements of building a community.
As in the last discussion, I still think they will be under a lot of
pressure, as they would have to deal with basic feature development, community
building, and following ASF incubator requirements at the same time if they are
accepted into the incubator. At the same time, the team lacks
experience with open source communities in or out of the ASF.
I am not sure whether this is good for the project. It seems a little
hurried to join the incubator.
More comments inline.

Willing to listen to what other IPMC members think.

 wrote on Tue, Mar 10, 2020 at 10:21 AM:

> Hi, All,
>
> We are China Telecom Corporation Limited Cloud Computing Branch
> Corporation.
> We hope to contribute one of our projects named 'HBlock' to Apache.
> Here is the proposal of HBlock project, please feel free to let me know
> your concerns and suggestions. Thank you so much.
>
> HBlock Proposal
>
> 1.Abstract
> The HBlock project will be an enterprise distributed block storage.
>
> 2.Proposal
> HBlock provides a distributed block storage with the following features:
> 2.1.User-space iSCSI target: HBlock will implement an iSCSI target that is
> RFC-7143 (https://tools.ietf.org/html/rfc7143) compliant written in pure
> Java designed to run on top of any mainstream Operating System, including
> Windows and Linux, as a user-space process.
> 2.2.Enterprise level features: HBlock will implement comprehensive
> enterprise level features, such as
> Asymmetric Logical Unit Access (ALUA, Information technology - SCSI Primary
> Commands - 4 (SPC-4), https://www.t10.org/cgi-bin/ac.pl?t=f=spc4r37.pdf),
>
> Persistent Reservations (PR, Information technology - SCSI Primary Commands
> - 4 (SPC-4), https://www.t10.org/cgi-bin/ac.pl?t=f=spc4r37.pdf),
> VMware vSphere Storage APIs - Array Integration (VAAI,
> https://www.vmware.com/techpapers/2012/vmware-vsphere-storage-apis-array-integration-10337.html),
> Offloaded Data Transfer (ODX,
> https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2012-R2-and-2012/hh831628(v=ws.11)),
> so that it will support
> session-level fail-over,
> Oracle Real Application Cluster(Oracle RAC,
> https://www.oracle.com/database/technologies/rac.html) ,
> Cluster File System (CFS), VMware cluster and Windows cluster.
> 2.3.Low latency: HBlock will implement an in-memory distributed cache to
> reduce write latency and improve Input / Output Operations Per Second
> (IOPS), and it will leverage storage-class memory to achieve even higher
> durability without IOPS loss.
> 2.4.Smart Compaction and Garbage Collection (GC): HBlock will convert all
> write operations into sequential append operations to improve random
> write performance, and it will choose the best time to compact and collect
> garbage per Logical Unit (LU). Compared to a Solid State Drive's (SSD's)
> internal garbage collection, such a global GC will reduce the need for the
> SSD's internal GC, which indirectly gives the SSD more usable space, and it
> allows an even better GC strategy because it is closer to the application.
> In essence, flash writes data in block (32MB) order. To support random
> writes, an SSD reserves part of its space for GC. Therefore, the more random
> writes and deletes there are, the more space needs to be reserved. HDFS-based
> writes are sequential for the SSD, so the space reserved in the SSD is small.
> In short, as long as there is a GC, there must be reserved space, either in
> the HBlock layer or in the controller layer inside the SSD. Because HBlock is
> closer to the LU, its GC can be more efficient. For example, an LU dedicated
> to video monitoring data basically writes video data in sequence and starts
> writing again when the disk is full. This LU does not need any GC at all.
> If GC is done in the SSD layer, the SSD will see the data of various LUs,
> and unnecessary data movement will be performed for the LU dedicated to
> video monitoring.
> 2.5.Hadoop Distributed File System (HDFS)-based: HBlock leverages HDFS as a
> persistent layer to avoid reinventing the wheel. The iSCSI target will run on
> the client side of HDFS and directly read or write data from or to Data
> Nodes.
> 2.6.Easy to deploy: HBlock will provide easy-to-use utilities to make the
> installation process extremely easy. Since HBlock does not rely on any
> particular Operating System, deployment is easy, unlike other storage systems
> that rely on an in-kernel iSCSI module, such as Linux-IO (LIO) or SCST.
>

I noticed a lot of `will`s in the Proposal section describing the
project's core features.
Is this a language issue, or are all these features not available today?
Which parts have been implemented?
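
To make section 2.4 of the quoted proposal more concrete, here is a minimal
sketch of the idea it describes: random block writes become sequential appends
to a log, and compaction is decided per LU. Nothing here comes from HBlock
itself, whose code is not shown in this thread; the class `AppendOnlyLU`, the
`garbage_ratio` heuristic, and the 0.5 threshold are all hypothetical and
chosen only for illustration.

```python
class AppendOnlyLU:
    """Toy logical unit (hypothetical): block writes become log appends."""

    def __init__(self):
        self.log = []    # append-only list of (lba, data) records
        self.index = {}  # lba -> position of the newest record in self.log

    def write(self, lba, data):
        # A write to any block address is a sequential append; an older
        # record for the same address simply becomes garbage.
        self.index[lba] = len(self.log)
        self.log.append((lba, data))

    def read(self, lba):
        pos = self.index.get(lba)
        return None if pos is None else self.log[pos][1]

    def garbage_ratio(self):
        # Fraction of log records that have been superseded by newer writes.
        if not self.log:
            return 0.0
        return 1.0 - len(self.index) / len(self.log)

    def compact(self):
        # Keep only the newest record per address, preserving log order.
        live_positions = sorted(self.index.values())
        live = [self.log[pos] for pos in live_positions]
        self.log = live
        self.index = {lba: i for i, (lba, _) in enumerate(live)}


def compact_if_needed(lu, threshold=0.5):
    """Per-LU decision (assumed policy): compact once enough garbage exists."""
    if lu.garbage_ratio() >= threshold:
        lu.compact()
        return True
    return False
```

In this toy model, an LU that only appends new blocks keeps a garbage ratio of
0, so `compact_if_needed` never triggers for it; how a real system would handle
wrap-around overwrites is outside this sketch.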



Re: [PROPOSAL] sparklyr

2019-10-26 Thread Kevin Kuo
A big thanks to all that have left feedback! After much deliberation, we
have decided to withdraw this proposal for the time being. The questions
around licenses are delicate, and we are currently not ready to navigate
them.

Cheers,
Kevin

On Mon, Oct 21, 2019 at 11:52 PM 申远  wrote:

> You could also read the documentation [1] about which licenses are allowed
> in ASF projects.
>
> [1] https://apache.org/legal/resolved.html#category-a
>
> Best Regards,
> YorkShen
>
> 申远
>
>
> 申远 wrote on Tue, Oct 22, 2019 at 2:49 PM:
>
> > Based on my experience (wearing my Apache Weex hat), a GPL/LGPL
> > dependency is not compatible with ASF policy, and you may want to fix the
> > license problem at the beginning, even before entering the Incubator.
> > Otherwise, GPL/LGPL dependencies will give you a lot more pain than you'd
> > ever expect.
> >
> > Best Regards,
> > YorkShen
> >
> > 申远
> >
> >
> > Javier Luraschi wrote on Tue, Oct 22, 2019 at 2:55 AM:
> >
> >> Regarding licenses, dplyr is under MIT, see:
> >> https://github.com/tidyverse/dplyr/blob/master/LICENSE.md. However,
> other
> >> packages are under GPL2.
> >>
> >> Here are all the packages that sparklyr currently depends on and their
> >> associated license (This was retrieved from
> >> https://CRAN.R-project.org/package=, since R package repo
> >> (CRAN) requires their license to be clearly defined).
> >>
> >> assertthat: GPL-3
> >> base64enc: GPL-2 | GPL-3
> >> config: GPL-3
> >> DBI: LGPL-2 | LGPL-2.1 | LGPL-3
> >> dplyr: MIT
> >> dbplyr: MIT
> >> digest: GPL-2 | GPL-3
> >> forge: Apache
> >> generics: GPL-2
> >> httr: MIT
> >> jsonlite: MIT
> >> openssl: MIT
> >> purrr: GPL-3
> >> r2d3: BSD-3
> >> rappdirs: MIT
> >> rlang: GPL-3
> >> rprojroot: GPL-3
> >> rstudioapi: MIT
> >> tibble: MIT
> >> tidyr: MIT
> >> withr: GPL-2 | GPL-3
> >> xml2: GPL-2 | GPL-3
> >> ellipsis: GPL-3
> >>
> >>
> >> On Mon, Oct 21, 2019 at 1:12 AM Justin Mclean  >
> >> wrote:
> >>
> >> > Hi,
> >> >
> >> > I am also concerned that the initial committer list only contains 3
> >> > committers. Why have you not included others in the community that
> >> > have made contributions?
> >> >
> >> > I don’t know if this is an issue or not but bring it up just in case
> >> > you are not aware. I can see that some of the tidyverse packages are
> >> > under GPL2, and the GPL license is not compatible with the ALv2. I’m
> >> > not 100% sure what license dplyr is under. I can see that sparklyr
> >> > depends on several (10+) GPL licensed pieces of software. Do you see
> >> > this causing any issue, as GPL code can’t be included in an Apache
> >> > source release and can’t be a non-optional dependency of an ASF
> >> > project? Have you discussed this with your champion or proposed
> >> > mentors and have they flagged this as a possible issue?
> >> >
> >> > I can see that one of the proposed mentors is not an IPMC member
> >> > (which is required) and another seems not very active in signing off
> >> > reports or voting on releases. Do you think the existing mentors will
> >> > provide your project with enough support?
> >> >
> >> > Thanks,
> >> > Justin
> >> >
> >> > 1. https://github.com/tidyverse/dplyr/blob/master/LICENSE
> >> > -
> >> > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> >> > For additional commands, e-mail: general-h...@incubator.apache.org
> >> >
> >> >
> >>
> >
>


Re: [PROPOSAL] sparklyr

2019-10-22 Thread 申远
You could also read the documentation [1] about which licenses are allowed
in ASF projects.

[1] https://apache.org/legal/resolved.html#category-a

Best Regards,
YorkShen

申远


申远 wrote on Tue, Oct 22, 2019 at 2:49 PM:

> Based on my experience (wearing my Apache Weex hat), a GPL/LGPL dependency
> is not compatible with ASF policy, and you may want to fix the license
> problem at the beginning, even before entering the Incubator. Otherwise,
> GPL/LGPL dependencies will give you a lot more pain than you'd ever expect.
>
> Best Regards,
> YorkShen
>
> 申远
>
>
> Javier Luraschi wrote on Tue, Oct 22, 2019 at 2:55 AM:
>
>> Regarding licenses, dplyr is under MIT, see:
>> https://github.com/tidyverse/dplyr/blob/master/LICENSE.md. However, other
>> packages are under GPL2.
>>
>> Here are all the packages that sparklyr currently depends on and their
>> associated license (This was retrieved from
>> https://CRAN.R-project.org/package=, since R package repo
>> (CRAN) requires their license to be clearly defined).
>>
>> assertthat: GPL-3
>> base64enc: GPL-2 | GPL-3
>> config: GPL-3
>> DBI: LGPL-2 | LGPL-2.1 | LGPL-3
>> dplyr: MIT
>> dbplyr: MIT
>> digest: GPL-2 | GPL-3
>> forge: Apache
>> generics: GPL-2
>> httr: MIT
>> jsonlite: MIT
>> openssl: MIT
>> purrr: GPL-3
>> r2d3: BSD-3
>> rappdirs: MIT
>> rlang: GPL-3
>> rprojroot: GPL-3
>> rstudioapi: MIT
>> tibble: MIT
>> tidyr: MIT
>> withr: GPL-2 | GPL-3
>> xml2: GPL-2 | GPL-3
>> ellipsis: GPL-3
>>
>>
>> On Mon, Oct 21, 2019 at 1:12 AM Justin Mclean 
>> wrote:
>>
>> > Hi,
>> >
>> > I am also concerned that the initial committer list only contains 3
>> > committers. Why have you not included others in the community that have
>> > made contributions?
>> >
>> > I don’t know if this is an issue or not but bring it up just in case you
>> > are not aware. I can see that some of the tidyverse packages are under
>> > GPL2, and the GPL license is not compatible with the ALv2. I’m not 100%
>> > sure what license dplyr is under. I can see that sparklyr depends on
>> > several (10+) GPL licensed pieces of software. Do you see this causing
>> > any issue, as GPL code can’t be included in an Apache source release and
>> > can’t be a non-optional dependency of an ASF project? Have you discussed
>> > this with your champion or proposed mentors and have they flagged this
>> > as a possible issue?
>> >
>> > I can see that one of the proposed mentors is not an IPMC member (which
>> > is required) and another seems not very active in signing off reports or
>> > voting on releases. Do you think the existing mentors will provide your
>> > project with enough support?
>> >
>> > Thanks,
>> > Justin
>> >
>> > 1. https://github.com/tidyverse/dplyr/blob/master/LICENSE
>> > -
>> > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
>> > For additional commands, e-mail: general-h...@incubator.apache.org
>> >
>> >
>>
>


Re: [PROPOSAL] sparklyr

2019-10-22 Thread 申远
Based on my experience (wearing my Apache Weex hat), a GPL/LGPL dependency
is not compatible with ASF policy, and you may want to fix the license
problem at the beginning, even before entering the Incubator. Otherwise,
GPL/LGPL dependencies will give you a lot more pain than you'd ever expect.

Best Regards,
YorkShen

申远


Javier Luraschi wrote on Tue, Oct 22, 2019 at 2:55 AM:

> Regarding licenses, dplyr is under MIT, see:
> https://github.com/tidyverse/dplyr/blob/master/LICENSE.md. However, other
> packages are under GPL2.
>
> Here are all the packages that sparklyr currently depends on and their
> associated license (This was retrieved from
> https://CRAN.R-project.org/package=, since R package repo
> (CRAN) requires their license to be clearly defined).
>
> assertthat: GPL-3
> base64enc: GPL-2 | GPL-3
> config: GPL-3
> DBI: LGPL-2 | LGPL-2.1 | LGPL-3
> dplyr: MIT
> dbplyr: MIT
> digest: GPL-2 | GPL-3
> forge: Apache
> generics: GPL-2
> httr: MIT
> jsonlite: MIT
> openssl: MIT
> purrr: GPL-3
> r2d3: BSD-3
> rappdirs: MIT
> rlang: GPL-3
> rprojroot: GPL-3
> rstudioapi: MIT
> tibble: MIT
> tidyr: MIT
> withr: GPL-2 | GPL-3
> xml2: GPL-2 | GPL-3
> ellipsis: GPL-3
>
>
> On Mon, Oct 21, 2019 at 1:12 AM Justin Mclean 
> wrote:
>
> > Hi,
> >
> > I am also concerned that the initial committer list only contains 3
> > committers. Why have you not included others in the community that have
> > made contributions?
> >
> > I don’t know if this is an issue or not but bring it up just in case you
> > are not aware. I can see that some of the tidyverse packages are under
> > GPL2, and the GPL license is not compatible with the ALv2. I’m not 100%
> > sure what license dplyr is under. I can see that sparklyr depends on
> > several (10+) GPL licensed pieces of software. Do you see this causing any
> > issue, as GPL code can’t be included in an Apache source release and can’t
> > be a non-optional dependency of an ASF project? Have you discussed this
> > with your champion or proposed mentors and have they flagged this as a
> > possible issue?
> >
> > I can see that one of the proposed mentors is not an IPMC member (which
> > is required) and another seems not very active in signing off reports or
> > voting on releases. Do you think the existing mentors will provide your
> > project with enough support?
> >
> > Thanks,
> > Justin
> >
> > 1. https://github.com/tidyverse/dplyr/blob/master/LICENSE
> > -
> > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> > For additional commands, e-mail: general-h...@incubator.apache.org
> >
> >
>


Re: [PROPOSAL] sparklyr

2019-10-21 Thread Javier Luraschi
Regarding licenses, dplyr is under MIT, see:
https://github.com/tidyverse/dplyr/blob/master/LICENSE.md. However, other
packages are under GPL2.

Here are all the packages that sparklyr currently depends on and their
associated licenses (retrieved from
https://CRAN.R-project.org/package=, since the R package repo
(CRAN) requires each license to be clearly defined).

assertthat: GPL-3
base64enc: GPL-2 | GPL-3
config: GPL-3
DBI: LGPL-2 | LGPL-2.1 | LGPL-3
dplyr: MIT
dbplyr: MIT
digest: GPL-2 | GPL-3
forge: Apache
generics: GPL-2
httr: MIT
jsonlite: MIT
openssl: MIT
purrr: GPL-3
r2d3: BSD-3
rappdirs: MIT
rlang: GPL-3
rprojroot: GPL-3
rstudioapi: MIT
tibble: MIT
tidyr: MIT
withr: GPL-2 | GPL-3
xml2: GPL-2 | GPL-3
ellipsis: GPL-3
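
As a rough illustration of how the list above maps onto the ASF third-party
license policy, the sketch below sorts the licenses exactly as reported in this
message into Category A (allowed) and Category X (not allowed as a required
dependency). The `CATEGORY_A` set, the prefix test, and the function names are
assumptions made for this example and cover only the licenses named here, not
the full policy at https://apache.org/legal/resolved.html; the license strings
are copied from the message and have not been re-checked against CRAN.

```python
# Licenses as reported in this message (not re-verified against CRAN).
DEPENDENCIES = {
    "assertthat": "GPL-3", "base64enc": "GPL-2 | GPL-3", "config": "GPL-3",
    "DBI": "LGPL-2 | LGPL-2.1 | LGPL-3", "dplyr": "MIT", "dbplyr": "MIT",
    "digest": "GPL-2 | GPL-3", "forge": "Apache", "generics": "GPL-2",
    "httr": "MIT", "jsonlite": "MIT", "openssl": "MIT", "purrr": "GPL-3",
    "r2d3": "BSD-3", "rappdirs": "MIT", "rlang": "GPL-3",
    "rprojroot": "GPL-3", "rstudioapi": "MIT", "tibble": "MIT",
    "tidyr": "MIT", "withr": "GPL-2 | GPL-3", "xml2": "GPL-2 | GPL-3",
    "ellipsis": "GPL-3",
}

# Assumption: only the license families that appear in the list above.
CATEGORY_A = {"MIT", "Apache", "BSD-3"}
CATEGORY_X_PREFIXES = ("GPL", "LGPL")


def classify(license_expr):
    """Return 'A', 'X', or 'unknown' for a 'L1 | L2' style license string."""
    options = [opt.strip() for opt in license_expr.split("|")]
    # If any offered alternative is Category A, the package can be used
    # under that alternative.
    if any(opt in CATEGORY_A for opt in options):
        return "A"
    # Blocked only when every alternative is a GPL/LGPL variant.
    if all(opt.startswith(CATEGORY_X_PREFIXES) for opt in options):
        return "X"
    return "unknown"


def blocked_dependencies(deps):
    """Names of packages whose every license option is Category X."""
    return sorted(name for name, lic in deps.items() if classify(lic) == "X")


if __name__ == "__main__":
    blocked = blocked_dependencies(DEPENDENCIES)
    print(len(blocked), "of", len(DEPENDENCIES), "dependencies are Category X")
    for name in blocked:
        print(" ", name, "-", DEPENDENCIES[name])
```

With the list as reported, about half of the packages fall into Category X,
which is consistent with the "several (10+)" estimate raised earlier in the
thread.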


On Mon, Oct 21, 2019 at 1:12 AM Justin Mclean 
wrote:

> Hi,
>
> I am also concerned that the initial committer list only contains 3
> committers. Why have you not included others in the community that have
> made contributions?
>
> I don’t know if this is an issue or not but bring it up just in case you
> are not aware. I can see that some of the tidyverse packages are under GPL2,
> and the GPL license is not compatible with the ALv2. I’m not 100% sure what
> license dplyr is under. I can see that sparklyr depends on several (10+) GPL
> licensed pieces of software. Do you see this causing any issue, as GPL code
> can’t be included in an Apache source release and can’t be a non-optional
> dependency of an ASF project? Have you discussed this with your champion or
> proposed mentors and have they flagged this as a possible issue?
>
> I can see that one of the proposed mentors is not an IPMC member (which is
> required) and another seems not very active in signing off reports or
> voting on releases. Do you think the existing mentors will provide your
> project with enough support?
>
> Thanks,
> Justin
>
> 1. https://github.com/tidyverse/dplyr/blob/master/LICENSE
> -
> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> For additional commands, e-mail: general-h...@incubator.apache.org
>
>


Re: [PROPOSAL] sparklyr

2019-10-21 Thread larry mccay
This looks interesting to me.
I would be willing to contribute, if you would like to add me to the
initial list of committers.


On Mon, Oct 21, 2019 at 10:50 AM Matt Sicker  wrote:

> A lot of core R libraries seem to be under GPL. If we build more R
> projects at Apache, it seems like we may need more Apache-licensed (or
> compatible) libraries in R.
>
> On Mon, 21 Oct 2019 at 03:12, Justin Mclean 
> wrote:
> >
> > Hi,
> >
> > I am also concerned that the initial committer list contains only 3
> > committers. Why have you not included others in the community who have
> > made contributions?
> >
> > I don’t know if this is an issue or not, but I bring it up just in case you
> > are not aware. I can see that some of the tidyverse packages are under GPL2,
> > and the GPL license is not compatible with the ALv2. I’m not 100% sure what
> > license dplyr is under. I can see that sparklyr depends on several (10+)
> > GPL-licensed pieces of software. Do you see this causing any issue, given
> > that GPL code can’t be included in an Apache source release and can’t be a
> > non-optional dependency of an ASF project? Have you discussed this with your
> > champion or proposed mentors, and have they flagged this as a possible issue?
> >
> > I can see that one of the proposed mentors is not an IPMC member (which is
> > required) and another seems not very active in signing off reports or
> > voting on releases. Do you think the existing mentors will provide your
> > project with enough support?
> >
> > Thanks,
> > Justin
> >
> > 1. https://github.com/tidyverse/dplyr/blob/master/LICENSE
> >
>
>
> --
> Matt Sicker 
>
>
>


Re: [PROPOSAL] sparklyr

2019-10-21 Thread Matt Sicker
A lot of core R libraries seem to be under GPL. If we build more R
projects at Apache, it seems like we may need more Apache-licensed (or
compatible) libraries in R.

On Mon, 21 Oct 2019 at 03:12, Justin Mclean  wrote:
>
> Hi,
>
> I am also concerned that the initial committer list contains only 3 committers.
> Why have you not included others in the community who have made
> contributions?
>
> I don’t know if this is an issue or not, but I bring it up just in case you are
> not aware. I can see that some of the tidyverse packages are under GPL2, and the
> GPL license is not compatible with the ALv2. I’m not 100% sure what license dplyr
> is under. I can see that sparklyr depends on several (10+) GPL-licensed pieces
> of software. Do you see this causing any issue, given that GPL code can’t be
> included in an Apache source release and can’t be a non-optional dependency of
> an ASF project? Have you discussed this with your champion or proposed mentors,
> and have they flagged this as a possible issue?
>
> I can see that one of the proposed mentors is not an IPMC member (which is
> required) and another seems not very active in signing off reports or voting
> on releases. Do you think the existing mentors will provide your project
> with enough support?
>
> Thanks,
> Justin
>
> 1. https://github.com/tidyverse/dplyr/blob/master/LICENSE
>


-- 
Matt Sicker 




Re: [PROPOSAL] sparklyr

2019-10-21 Thread Justin Mclean
Hi,

I am also concerned that the initial committer list contains only 3 committers.
Why have you not included others in the community who have made contributions?

I don’t know if this is an issue or not, but I bring it up just in case you are
not aware. I can see that some of the tidyverse packages are under GPL2, and the
GPL license is not compatible with the ALv2. I’m not 100% sure what license dplyr
is under. I can see that sparklyr depends on several (10+) GPL-licensed pieces
of software. Do you see this causing any issue, given that GPL code can’t be
included in an Apache source release and can’t be a non-optional dependency of
an ASF project? Have you discussed this with your champion or proposed mentors,
and have they flagged this as a possible issue?

I can see that one of the proposed mentors is not an IPMC member (which is
required) and another seems not very active in signing off reports or voting on
releases. Do you think the existing mentors will provide your project with
enough support?

Thanks,
Justin

1. https://github.com/tidyverse/dplyr/blob/master/LICENSE



Re: [PROPOSAL] sparklyr

2019-10-19 Thread Jean-Baptiste Onofré
Hi,

it's an interesting proposal.

I guess that one of your challenges during incubation will be to grow
the community (only 3 initial committers is very low) and to broaden its
diversity (only two company affiliations).

Regards
JB

On 19/10/2019 17:53, Kevin Kuo wrote:
> Greetings!
> 
> We are proposing to enter sparklyr (https://spark.rstudio.com/), an open
> source R package for interfacing with Apache Spark, into incubation. Please
> see the proposal below.
> 
> ==
> 
> = Abstract =
> 
> sparklyr is an open source R package providing an interface to Apache
> Spark, a system for large-scale data analysis on clusters. It provides a
> dplyr interface for manipulating Spark DataFrames, supports the Spark ML
> and Structured Streaming components, and offers a developer API to create
> extensions.
> 
> = Proposal =
> 
> The sparklyr project, along with the ecosystem of extensions it supports,
> aims to democratize the capabilities of Apache Spark for R users, who
> represent a significant portion of data scientists today. The API is
> designed to reduce friction for users transitioning from local, “small
> data” workflows to computing on clusters, while preserving the flexibility
> of Apache Spark as much as possible. Some features include:
> 
> - It is compatible with the tidyverse ecosystem of packages, which is a
> popular collection of libraries for data science in R. Specifically, one
> can use `dplyr` verbs to manipulate Spark DataFrames. However, one can also
> use sparklyr without using tidyverse packages.
> - It features an extensions API that allows users to easily wrap existing
> Spark packages written in Scala. This has enabled the development of
> sparkxgb (interface for xgboost4j), graphframes (interface for
> GraphFrames), mleap (interface for MLeap), and sparktf (interface for Spark
> TensorFlow connector), to name a few.
> 
> = Rationale =
> 
> By becoming an Apache project, sparklyr can better align with the Apache
> Spark project, and encourage stronger collaboration among users and
> contributors in the R and Apache communities. Culturally, sparklyr is also
> a good fit for ASF: the development of the project has adhered to the
> Apache way since inception, and the current contributors are committed to
> upholding those values.
> 
> = Initial Goals =
> 
> The initial goals will be to move the existing codebase to Apache and the
> documentation from the RStudio domain to Apache.
> 
> = Current Status =
> 
> == Meritocracy ==
> 
> The sparklyr project has operated on meritocratic principles since
> inception. We have accepted major patches from developers outside RStudio,
> and have operated with the implicit expectation that contributors to major
> features maintain those features.
> 
> == Community ==
> 
> The sparklyr project currently has 699 stars on GitHub, 52 direct
> contributors, ~1,400 issues (approximately 500 of those are open), and
> approximately 194,000 downloads from CRAN each month. The documentation
> website spark.rstudio.com achieves ~15k visitors per month. There are also
> more than 15 open source extensions written that implement features such as
> genomic analysis and interoperability with databases.
> 
> = Known Risks =
> 
> == Reliance on Salaried Developers ==
> 
> sparklyr is currently maintained by salaried developers at RStudio and
> receives some ongoing contributions from the community, although all
> committers are employed by RStudio. We hope that by becoming an Apache
> project, the project will garner additional developer interest and expand
> the diversity of committers.
> 
> = Documentation =
> 
> Documentation of the project can be found at https://spark.rstudio.com/ and
> https://cran.r-project.org/web/packages/sparklyr/sparklyr.pdf. There is
> also a free online book, available at https://therinspark.com/, that can be
> used as a reference.
> 
> = Initial Source =
> 
> The sparklyr codebase is currently hosted on GitHub:
> https://github.com/rstudio/sparklyr. sparklyr has been Apache 2.0 licensed
> since inception. RStudio currently maintains CLAs from all significant
> contributors. RStudio does not own the copyright of sparklyr and it is not
> a trademark.
> 
> = External Dependencies =
> 
> We remark that `sparklyr` imports some R packages that are not
> Apache-compatible licensed; however, these packages are not distributed
> with the project. Note, for example, R itself is GPLv2 licensed.
> 
> = Required Resources =
> 
> - Mailing lists: {users, dev, commits}@sparklyr.incubator.apache.org
> - GitHub repo
> - If possible, we would like to continue using GitHub for issue tracking,
> as it is much more familiar to the R community than JIRA.
> 
> = Project Name =
> 
> There is sufficient goodwill built around the package so we would like to
> keep the name. sparklyr is pronounced spark-lee-R, i.e. does not rhyme with
> the data manipulation package dplyr, and is never capitalized. Incorrect
> spellings include SparklyR and sparklyR.
> 
> = Initial 

Re: [PROPOSAL] sparklyr

2019-10-19 Thread Dave Fisher
Hi -

An interesting proposal. I am concerned about the very small size of the 
Initial Committer list with 3 individuals one of whom I only see small 
contributions from on https://github.com/rstudio/sparklyr/graphs/contributors

Do the Mentors intend to be active participants in the community?

Also, Sean will need to join the IPMC which is easy for him to request.

Regards,
Dave

> On Oct 19, 2019, at 8:53 AM, Kevin Kuo  wrote:
> 
> Greetings!
> 
> We are proposing to enter sparklyr (https://spark.rstudio.com/), an open
> source R package for interfacing with Apache Spark, into incubation. Please
> see the proposal below.
> 
> ==
> 
> = Abstract =
> 
> sparklyr is an open source R package providing an interface to Apache
> Spark, a system for large-scale data analysis on clusters. It provides a
> dplyr interface for manipulating Spark DataFrames, supports the Spark ML
> and Structured Streaming components, and offers a developer API to create
> extensions.
> 
> = Proposal =
> 
> The sparklyr project, along with the ecosystem of extensions it supports,
> aims to democratize the capabilities of Apache Spark for R users, who
> represent a significant portion of data scientists today. The API is
> designed to reduce friction for users transitioning from local, “small
> data” workflows to computing on clusters, while preserving the flexibility
> of Apache Spark as much as possible. Some features include:
> 
> - It is compatible with the tidyverse ecosystem of packages, which is a
> popular collection of libraries for data science in R. Specifically, one
> can use `dplyr` verbs to manipulate Spark DataFrames. However, one can also
> use sparklyr without using tidyverse packages.
> - It features an extensions API that allows users to easily wrap existing
> Spark packages written in Scala. This has enabled the development of
> sparkxgb (interface for xgboost4j), graphframes (interface for
> GraphFrames), mleap (interface for MLeap), and sparktf (interface for Spark
> TensorFlow connector), to name a few.
> 
> = Rationale =
> 
> By becoming an Apache project, sparklyr can better align with the Apache
> Spark project, and encourage stronger collaboration among users and
> contributors in the R and Apache communities. Culturally, sparklyr is also
> a good fit for ASF: the development of the project has adhered to the
> Apache way since inception, and the current contributors are committed to
> upholding those values.
> 
> = Initial Goals =
> 
> The initial goals will be to move the existing codebase to Apache and the
> documentation from the RStudio domain to Apache.
> 
> = Current Status =
> 
> == Meritocracy ==
> 
> The sparklyr project has operated on meritocratic principles since
> inception. We have accepted major patches from developers outside RStudio,
> and have operated with the implicit expectation that contributors to major
> features maintain those features.
> 
> == Community ==
> 
> The sparklyr project currently has 699 stars on GitHub, 52 direct
> contributors, ~1,400 issues (approximately 500 of those are open), and
> approximately 194,000 downloads from CRAN each month. The documentation
> website spark.rstudio.com achieves ~15k visitors per month. There are also
> more than 15 open source extensions written that implement features such as
> genomic analysis and interoperability with databases.
> 
> = Known Risks =
> 
> == Reliance on Salaried Developers ==
> 
> sparklyr is currently maintained by salaried developers at RStudio and
> receives some ongoing contributions from the community, although all
> committers are employed by RStudio. We hope that by becoming an Apache
> project, the project will garner additional developer interest and expand
> the diversity of committers.
> 
> = Documentation =
> 
> Documentation of the project can be found at https://spark.rstudio.com/ and
> https://cran.r-project.org/web/packages/sparklyr/sparklyr.pdf. There is
> also a free online book, available at https://therinspark.com/, that can be
> used as a reference.
> 
> = Initial Source =
> 
> The sparklyr codebase is currently hosted on GitHub:
> https://github.com/rstudio/sparklyr. sparklyr has been Apache 2.0 licensed
> since inception. RStudio currently maintains CLAs from all significant
> contributors. RStudio does not own the copyright of sparklyr and it is not
> a trademark.
> 
> = External Dependencies =
> 
> We remark that `sparklyr` imports some R packages that are not
> Apache-compatible licensed; however, these packages are not distributed
> with the project. Note, for example, R itself is GPLv2 licensed.
> 
> = Required Resources =
> 
> - Mailing lists: {users, dev, commits}@sparklyr.incubator.apache.org
> - GitHub repo
> - If possible, we would like to continue using GitHub for issue tracking,
> as it is much more familiar to the R community than JIRA.
> 
> = Project Name =
> 
> There is sufficient goodwill built around the package so we would like to
> keep the name. sparklyr is pronounced 

Re: [PROPOSAL] MetaObjects for Apache Incubator

2019-09-05 Thread ซ่อยค่อย ลืมเขาแน่
On Fri, 6 Sep 2019 at 01:25, Nygard, Carl J wrote:

>
>
>
>
> Hi,
>
> > > Cengage wishes to donate the code to Apache, but is unable to do so
> until the project is accepted as an Incubator project.
>
> > Could you expand a little on why that is?
>
> Perhaps it is an issue of understanding the path to Incubator and trying
> to deal with the corporate policies regarding open-source donation.
> Donation to Apache seems to be the clearest path with the least number of
> obstacles.
>
> > > The Apache community members looking to approve the project
> understandably would like to better understand what they are approving and
> so would like to get access to the code which is to be donated.
>
> > I don’t see this as being a big issue; the other code can be donated
> > later, but it may complicate things in terms of SGAs, ICLAs and IP provenance.
>
> Can you expand on this?  Is there a different process for donating a large
> body of work to an existing Incubator project?
>
> --carl
>
>
>
>


Re: [PROPOSAL] MetaObjects for Apache Incubator

2019-09-05 Thread Dave Fisher
Hi,

> On Sep 5, 2019, at 11:25 AM, Nygard, Carl J  wrote:

> Hi,
> 
>>> Cengage wishes to donate the code to Apache, but is unable to do so until 
>>> the project is accepted as an Incubator project.
> 
>> Could you expand a little on why that is?
> 
> Perhaps it is an issue of understanding the path to Incubator and trying to 
> deal with the corporate policies regarding open-source donation.  Donation to 
> Apache seems to be the clearest path with the least number of obstacles.

Agreed. The concern is that, should the SGA not happen due to internal issues at
the donating company, the podling will have trouble succeeding. We’ve seen that
with a podling that took over 6 months to get an SGA and essentially lost all
its mentors.
> 
>>> The Apache community members looking to approve the project understandably 
>>> would like to better understand what they are approving and so would like 
>>> to get access to the code which is to be donated.
> 
>> I don’t see this as being a big issue; the other code can be donated later,
>> but it may complicate things in terms of SGAs, ICLAs and IP provenance.

The issue is seeing the code to check to see if there are dependencies that 
don’t fit with Apache Release Policy. Of course this is also a part of 
Incubation.

> 
> Can you expand on this?  Is there a different process for donating a large 
> body of work to an existing Incubator project?  

No - such a donation would be an additional SGA and/or CLA.

I have not looked deeply at your proposal, but I think the concern would be who 
your mentors are and whether or not the IPMC thinks that these issues will be 
handled properly.

(But I’m just another member of the IPMC and not Justin)

Regards,
Dave

> 
> --carl
> 
> 
> 





Re: [PROPOSAL] MetaObjects for Apache Incubator

2019-09-05 Thread Nygard, Carl J




Hi,

> > Cengage wishes to donate the code to Apache, but is unable to do so until 
> > the project is accepted as an Incubator project.

> Could you expand a little on why that is?

Perhaps it is an issue of understanding the path to Incubator and trying to 
deal with the corporate policies regarding open-source donation.  Donation to 
Apache seems to be the clearest path with the least number of obstacles.

> > The Apache community members looking to approve the project understandably 
> > would like to better understand what they are approving and so would like 
> > to get access to the code which is to be donated.

> I don’t see this as being a big issue; the other code can be donated later,
> but it may complicate things in terms of SGAs, ICLAs and IP provenance.

Can you expand on this?  Is there a different process for donating a large body 
of work to an existing Incubator project?  

--carl





Re: [PROPOSAL] MetaObjects for Apache Incubator

2019-09-03 Thread Justin Mclean
Hi,

> Cengage wishes to donate the code to Apache, but is unable to do so until the 
> project is accepted as an Incubator project. 

Could you expand a little on why that is?

> The Apache community members looking to approve the project understandably 
> would like to better understand what they are approving and so would like to 
> get access to the code which is to be donated.

I don’t see this as being a big issue; the other code can be donated later, but
it may complicate things in terms of SGAs, ICLAs and IP provenance.

Thanks,
Justin



Re: [PROPOSAL] MetaObjects for Apache Incubator

2019-09-03 Thread Nygard, Carl J
Justin,

I'm running into a bit of chicken/egg problem in terms of process for donating 
source code.  

To frame the problem:  Part of the code is already released as open-source, 
which is available in the github repository.  Part of the code is yet to be 
released since it was developed internally by Cengage.  Cengage wishes to 
donate the code to Apache, but is unable to do so until the project is accepted 
as an Incubator project.  The Apache community members looking to approve the 
project understandably would like to better understand what they are approving 
and so would like to get access to the code which is to be donated.

I'm sure you've seen similar issues in the past and I'm looking for some 
advice/options to be able to move forward.

--carl



From: Justin Mclean 
Sent: Tuesday, July 30, 2019 4:35 AM
To: general@incubator.apache.org 
Subject: Re: [PROPOSAL] MetaObjects for Apache Incubator

 Hi,

> Currently most development is happening within Cengage.  The core library
> hasn't changed much, aside from some bug fixes.  Most of the recent
> innovation has occurred in other libraries built upon the draagon-metaobjects
> foundations.  That code is part of what is proposed to be donated to the
> apache-metaobjects project.

Thanks for that. I’m still a little unclear about what code base is being
donated. I’m just asking as that may make the IP transfer process easier or
harder; you might have to get ICLAs from people who have worked on it in the
past, for instance.

> Regarding the mentor list, I saw the Apache committer list included Heath,
> but misunderstood that being a committer did not imply ASF membership, which
> was my mistake.  However, Johan Edstrom and James Carman are members of the
> IPMC, and Jamie mentioned to me that he has requested IPMC membership as
> well.  In addition, James has mentored a few projects in the past.  We would
> welcome another mentor if anyone is interested in helping build a community
> around the project.

I think it would be best for the project if you had a mentor with some more
experience. It can be difficult if your mentors go missing or if they are not
familiar with recent changes in infrastructure or the incubator.

Thanks,
Justin








Re: [PROPOSAL] MetaObjects for Apache Incubator

2019-08-19 Thread Nygard, Carl J
Julian,

I'm currently working through the details of releasing the code with our legal 
department.  Clearly this has taken longer than I anticipated, given that we 
had fundamental agreement to release before I started the process.  Who knew?  
;)

I'll post to the list a link to the latest set of code when I have it available.

Until then, I'm happy to discuss your use-case and see how it fits into the 
project vision.  Can you give an example of the type of code generation or 
model-driven design you want to achieve?

--carl


From: Julian Feinauer 
Sent: Wednesday, July 31, 2019 1:40 AM
To: general@incubator.apache.org 
Subject: AW: [PROPOSAL] MetaObjects for Apache Incubator
 
Hi Carl,

The project sounds interesting and as I understand it the aim is code 
generation from models.
Is this right?

It would be interesting to see the current version of the code base to get a 
better feeling on that.

We use and want to use code generation heavily in the apache plc4x project and 
I would love to have another project where such efforts are driven aside from 
doing it all by ourselves.

Best
Julian

Sent from my mobile phone


 Original message 
Subject: Re: [PROPOSAL] MetaObjects for Apache Incubator
From: Justin Mclean
To: general@incubator.apache.org
Cc:

Hi,

> Currently most development is happening within Cengage.  The core library
> hasn't changed much, aside from some bug fixes.  Most of the recent
> innovation has occurred in other libraries built upon the draagon-metaobjects
> foundations.  That code is part of what is proposed to be donated to the
> apache-metaobjects project.

Thanks for that. I’m still a little unclear about what code base is being
donated. I’m just asking as that may make the IP transfer process easier or
harder; you might have to get ICLAs from people who have worked on it in the
past, for instance.

> Regarding the mentor list, I saw the Apache committer list included Heath,
> but misunderstood that being a committer did not imply ASF membership, which
> was my mistake.  However, Johan Edstrom and James Carman are members of the
> IPMC, and Jamie mentioned to me that he has requested IPMC membership as
> well.  In addition, James has mentored a few projects in the past.  We would
> welcome another mentor if anyone is interested in helping build a community
> around the project.

I think it would be best for the project if you had a mentor with some more
experience. It can be difficult if your mentors go missing or if they are not
familiar with recent changes in infrastructure or the incubator.

Thanks,
Justin







Re: [PROPOSAL] MetaObjects for Apache Incubator

2019-07-30 Thread Justin Mclean
Hi,

> Currently most development is happening within Cengage.  The core library 
> hasn't changed much, aside from some bug fixes.  Most of the recent 
> innovation has occurred in other libraries built upon the draagon-metaobjects 
> foundations.  That code is part of what is proposed to be donated to the 
> apache-metaobjects project.

Thanks for that. I’m still a little unclear about what code base is being
donated. I’m just asking as that may make the IP transfer process easier or
harder; you might have to get ICLAs from people who have worked on it in the
past, for instance.

> Regarding the mentor list, I saw the Apache committer list included Heath, 
> but misunderstood that being a committer did not imply ASF membership, which 
> was my mistake.  However, Johan Edstrom and James Carman are members of the 
> IPMC, and Jamie mentioned to me that he has requested IPMC membership as 
> well.  In addition, James has mentored a few projects in the past.  We would 
> welcome another mentor if anyone is interested in helping build a community 
> around the project.

I think it would be best for the project if you had a mentor with some more
experience. It can be difficult if your mentors go missing or if they are not
familiar with recent changes in infrastructure or the incubator.

Thanks,
Justin



Re: [PROPOSAL] MetaObjects for Apache Incubator

2019-07-29 Thread Nygard, Carl J

> > We believe that collaboration is a necessary prerequisite to great
> > outcomes. We believe that healthy debate is an aspect of collaboration
> > necessary for high-quality decisions.

> Where are discussions and decisions like this currently being made about
> the code base?

> Note, MetaObjects has been open-sourced in 2014 and is available at
> https://github.com/Draagon/draagon-metaobjects.

> I note that the last commit was 3 years ago, there is not a lot of activity,
> and the repo has only one committer. Is there any reason for that? I assume
> active development is happening elsewhere? Is this repo going to be the
> donation, or something else?

> > * Champion:  Johan Edstrom - Apache, Savoirtech
> > 
> > * Nominated Mentors:
> > ** James Carman - Apache, Cengage
> > ** Heath Kesler - Apache, Savoirtech
> > ** Jamie Goodyear - Apache, Savoirtech


> While anyone can help out, official mentors need to be part of the Incubator
> PMC. ASF members can ask to join the PMC; currently Heath Kesler is not an ASF
> member or IPMC member, so he would be unable to be an official mentor. Have any
> of these proposed mentors mentored an Apache project before? If not, do they
> understand what is required of them? I don’t believe Heath or Jamie have been
> active on this mailing list.

Currently most development is happening within Cengage.  The core library 
hasn't changed much, aside from some bug fixes.  Most of the recent innovation 
has occurred in other libraries built upon the draagon-metaobjects foundations. 
 That code is part of what is proposed to be donated to the apache-metaobjects 
project.

Regarding the mentor list, I saw the Apache committer list included Heath, but 
misunderstood that being a committer did not imply ASF membership, which was my 
mistake.  However, Johan Edstrom and James Carman are members of the IPMC, and 
Jamie mentioned to me that he has requested IPMC membership as well.  In 
addition, James has mentored a few projects in the past.  We would welcome 
another mentor if anyone is interested in helping build a community around the 
project.

--carl








Re: [PROPOSAL] MetaObjects for Apache Incubator

2019-07-29 Thread Jamie Goodyear
Hi Justin,

 Resending a reply as the first one bounced.

 I have sent a request to private @ incubator RE joining the PMC. 

 This would be my first time acting as an official mentor, any guidance and
help would be greatly appreciated :) 


Cheers,
Jamie



--
Sent from: http://apache-incubator-general.996316.n3.nabble.com/




Re: [PROPOSAL] MetaObjects for Apache Incubator

2019-07-28 Thread Justin Mclean
Hi,

Thanks for your proposal; it sounds like an interesting project.

> We believe that collaboration is a necessary prerequisite to great
> outcomes. We believe that healthy debate is an aspect of collaboration
> necessary for high-quality decisions.

Where are discussions and decisions like this currently being made about the
code base?

> Note, MetaObjects has been open-sourced in 2014 and is available at
> https://github.com/Draagon/draagon-metaobjects.

I note that the last commit was 3 years ago, there is not a lot of activity,
and the repo has only one committer. Is there any reason for that? I assume
active development is happening elsewhere? Is this repo going to be the
donation, or something else?

> 
> * Champion:  Johan Edstrom - Apache, Savoirtech
> 
> * Nominated Mentors:
> ** James Carman - Apache, Cengage
> ** Heath Kesler - Apache, Savoirtech
> ** Jamie Goodyear - Apache, Savoirtech

While anyone can help out, official mentors need to be part of the Incubator
PMC. ASF members can ask to join the PMC; currently Heath Kesler is not an ASF
member or IPMC member, so he would be unable to be an official mentor. Have any
of these proposed mentors mentored an Apache project before? If not, do they
understand what is required of them? I don’t believe Heath or Jamie have been
active on this mailing list.

Thanks,
Justin





Re: [PROPOSAL] Apache DataSketches

2019-03-25 Thread leerho
I went ahead and performed the following searches based on the list someone
else provided.  Perhaps you can use this?

Note: the term "sketch" commonly refers to an artistic visualization or
drawing.
In the study of algorithms, the term "sketch" refers to a synopsis
of some larger set of data where the synopsis is approximate, simplified
(not all the detail), and can be executed quickly.  These properties are
shared with artistic sketches, but there the similarity ends. DataSketches
have nothing to do with visualization at all.

Search results.

https://github.com/search?o=desc=datasketches
Returned links are indirect references to our site, or a reference to a site
about data art.

https://opensource.google.com/projects/search?q=datasketches
No hits

https://sourceforge.net/directory/os:mac/?q=datasketches
No hits

https://www.openhub.net/p?ref=homepage=datasketches
No hits

https://www.trademarkia.com
No hits: "data sketch", "data sketches", "data-sketch", "data-sketches",
"datasketch", or "datasketches".

https://trademarks.justia.com/search?q=datasketches
No hits: "data sketch", "data sketches", "data-sketch", "data-sketches",
"datasketch", or "datasketches".

http://tmsearch.uspto.gov/
No hits: "data sketch", "data sketches", "data-sketch", "data-sketches",
"datasketch", or "datasketches".

https://www.google.com/search?q=datasketches=datasketches
About 37,600 results; almost all are indirect references to our site or to
sites about artistic visual renderings of data. Searching for
"datasketches" (with quotes) returns a much smaller set (6800) that mostly
refers to our software.

https://en.wikipedia.org/wiki/datasketches
q: "datasketches": No hits
q: "data sketches" One hit: the common data science use of the pair of
words referring to sketching algorithms: "The different techniques can be
classified according to the data sketches they store."

https://stackoverflow.com/search?q=datasketches
2 hits that refer back to our software (Druid-datasketches is our software)
q:data sketches

https://www.linkedin.com/company/datasketches/about/
No hits

https://en.oxforddictionaries.com/search?filter=dictionary=datasketches
No hits

On Mon, Mar 25, 2019 at 1:36 PM Kenneth Knowles  wrote:

> The vote is passed to accept into the incubator. Since there is a cost to
> changing the name once infrastructure is set up, I suggest doing the name
> search immediately. There seemed to be some consensus to try to keep the
> DataSketches name. If there are no objections, I will file a
> PODLINGNAMESEARCH for this.
>
> Kenn
>
> On Tue, Feb 26, 2019 at 3:58 PM Liang Chen 
> wrote:
>
> > Hi Justin
> >
> > You are right, should be "Liang Chen", already updated it.
> >
> > Justin, could you please help to check my right to create new proposal on
> > incubator wiki at :
> > https://wiki.apache.org/incubator/ProjectProposals
> >
> > Regards
> > Liang
> >
> > Justin Mclean wrote
> > > Hi,
> > >
> > >> Currently only IPMC members can be official mentors, of the 3 people
> > >> listed here I believe only Jean-Baptiste Onofré is an IPMC member.
> > >
> > > Sorry, my apologies, Liang Chen is also an IPMC member, (Chen Liang,
> and
> > > presumedly a different person, is a committer but not an IPMC member)
> but
> > > I cannot find Gil Yehuda, do you mind provide a link to the roster for
> > > them or their Apache id?
> > >
> > > Thanks,
> > > Justin
>


Re: [PROPOSAL] Apache DataSketches

2019-03-25 Thread Kenneth Knowles
The vote to accept the project into the incubator has passed. Since there is
a cost to changing the name once infrastructure is set up, I suggest doing
the name search immediately. There seemed to be some consensus to try to keep
the DataSketches name. If there are no objections, I will file a
PODLINGNAMESEARCH for this.

Kenn

On Tue, Feb 26, 2019 at 3:58 PM Liang Chen  wrote:

> Hi Justin
>
> You are right, should be "Liang Chen", already updated it.
>
> Justin, could you please help to check my right to create new proposal on
> incubator wiki at :
> https://wiki.apache.org/incubator/ProjectProposals
>
> Regards
> Liang
>
> Justin Mclean wrote
> > Hi,
> >
> >> Currently only IPMC members can be official mentors, of the 3 people
> >> listed here I believe only Jean-Baptiste Onofré is an IPMC member.
> >
> > Sorry, my apologies, Liang Chen is also an IPMC member, (Chen Liang, and
> > presumedly a different person, is a committer but not an IPMC member) but
> > I cannot find Gil Yehuda, do you mind provide a link to the roster for
> > them or their Apache id?
> >
> > Thanks,
> > Justin
>


Re: [Proposal] Apache TVM

2019-03-02 Thread Tianqi Chen
Thanks Henry!

On Thu, Feb 28, 2019 at 10:57 AM Henry Saputra 
wrote:

> Thanks, Markus.
>
> Hope you do not mind but I have edited the proposal to reflect the changes.
> Since the people did not actually change, I think we can continue with the
> VOTE
>
>
> - Henry
>
> On Thu, Feb 28, 2019 at 10:20 AM Markus Weimer  wrote:
>
> > On Thu, Feb 28, 2019 at 9:36 AM Henry Saputra 
> > wrote:
> >
> > > > What I can do instead is to restructure the proposal to have PPMC to
> > > > include mentors and the PMC members from TVM.
> > > > And the rest of committers from TVM will invited from VOTE from PPMC.
> > >
> >
> > Yes, that is what I should have done in the final edits of the Proposal,
> > but did not do. This is how all other incubator projects I've been in
> have
> > done it: PPMC is mentors + leaders / founders / members of the inbound
> > project. For TVM, the most appropriate thing is to have the PPMC be
> mentors
> > + TVM's current PMC.
> >
> > If we agree on that, I'd like to make the change in the proposal, and
> leave
> > the vote open.
> >
> > Thanks for spotting this, Henry!
> >
> > Markus
> >
>


Re: [Proposal] Apache TVM

2019-02-28 Thread Markus Weimer
On Thu, Feb 28, 2019 at 10:57 AM Henry Saputra  wrote:
> Hope you do not mind but I have edited the proposal to reflect the changes.

Thanks!

Markus

On Thu, Feb 28, 2019 at 3:44 PM Markus Weimer  wrote:
>
> On Thu, Feb 28, 2019 at 10:57 AM Henry Saputra  
> wrote:
> > Hope you do not mind but I have edited the proposal to reflect the changes.
>
> Thanks!
>
> Markus




Re: [Proposal] Apache TVM

2019-02-28 Thread Markus Weimer
On Thu, Feb 28, 2019 at 10:57 AM Henry Saputra  wrote:
> Hope you do not mind but I have edited the proposal to reflect the changes.

Thanks!

Markus




Re: [Proposal] Apache TVM

2019-02-28 Thread Henry Saputra
Thanks, Markus.

Hope you do not mind, but I have edited the proposal to reflect the changes.
Since the people did not actually change, I think we can continue with the
VOTE.


- Henry

On Thu, Feb 28, 2019 at 10:20 AM Markus Weimer  wrote:

> On Thu, Feb 28, 2019 at 9:36 AM Henry Saputra 
> wrote:
>
> > > What I can do instead is to restructure the proposal to have PPMC to
> > > include mentors and the PMC members from TVM.
> > > And the rest of committers from TVM will invited from VOTE from PPMC.
> >
>
> Yes, that is what I should have done in the final edits of the Proposal,
> but did not do. This is how all other incubator projects I've been in have
> done it: PPMC is mentors + leaders / founders / members of the inbound
> project. For TVM, the most appropriate thing is to have the PPMC be mentors
> + TVM's current PMC.
>
> If we agree on that, I'd like to make the change in the proposal, and leave
> the vote open.
>
> Thanks for spotting this, Henry!
>
> Markus
>


Re: [Proposal] Apache TVM

2019-02-28 Thread Markus Weimer
On Thu, Feb 28, 2019 at 9:36 AM Henry Saputra 
wrote:

> > What I can do instead is to restructure the proposal to have PPMC to
> > include mentors and the PMC members from TVM.
> > And the rest of committers from TVM will invited from VOTE from PPMC.
>

Yes, that is what I should have done in the final edits of the Proposal,
but did not do. This is how all other incubator projects I've been in have
done it: PPMC is mentors + leaders / founders / members of the inbound
project. For TVM, the most appropriate thing is to have the PPMC be mentors
+ TVM's current PMC.

If we agree on that, I'd like to make the change in the proposal, and leave
the vote open.

Thanks for spotting this, Henry!

Markus


Re: [Proposal] Apache TVM

2019-02-28 Thread Henry Saputra
Hi Tianqi,

Actually, for the initial committers, I believe we can onboard them as part
of bootstrapping the project.

Any member of the IPMC can keep me honest here too =)

Reference for Incubator PPMC for info:
https://incubator.apache.org/guides/ppmc.html

Thanks,

- Henry

On Thu, Feb 28, 2019 at 9:36 AM Henry Saputra 
wrote:

> HI Tianqi,
>
> What I can do instead is to restructure the proposal to have PPMC to
> include mentors and the PMC members from TVM.
> And the rest of committers from TVM will invited from VOTE from PPMC.
>
> Would that work?
>
> - Henry
>
> On Thu, Feb 28, 2019 at 2:13 AM Tianqi Chen 
> wrote:
>
>> Hi Henry:
>>
>> Because the TVM community already adopts Apache meritocracy and has a
>> separation of PMC and committers. Every new member(PMC and committers) are
>> formally discussed and we welcome each member in the community by
>> summarizing their contributions.
>> If possible,  we would like to keep the same structure during incubation.
>> The current PMC members are actively proposing new committers and PMC
>> members from different organizations in the past few months and will
>> continue doing so after the incubation.
>>
>> Tianqi
>>
>> On Wed, Feb 27, 2019 at 9:07 PM Henry Saputra 
>> wrote:
>>
>> > Bit more clarifications, as new podling in Apache, the initial members
>> of
>> > PPMC consist of mentors and initial commiters of the project.
>> >
>> > I understand TVM already work mirroring ASF meritoracy [1] but we need
>> to
>> > change the proposal to follow Apache guidelines to help us cross check
>> > membership later for onboarding.
>> >
>> > If it is OK with you I will change the proposal to merge the "Initial
>> PPMC
>> > Members" and "Initial Committers", minus the mentors from ASF, to be
>> just
>> > Initial Committers.
>> >
>> > Thanks,
>> >
>> > - Henry
>> >
>> >
>> > [1] https://github.com/dmlc/tvm/blob/master/CONTRIBUTORS.md
>> >
>> > On Tue, Feb 26, 2019 at 9:56 AM Markus Weimer 
>> wrote:
>> >
>> > > Thanks everyone for the discussion thus far. Based on it, I have
>> uploaded
>> > > an updated proposal here:
>> > >
>> > > https://wiki.apache.org/incubator/TVMProposal
>> > >
>> > > The changes made are:
>> > >
>> > >1. Rectify the language around PMC vs. PMC member. Thanks Greg, for
>> > >pointing that out!
>> > >2. Adding Furkan, Timothy and Henry as additional mentors. We can
>> use
>> > >all the help :)
>> > >
>> > > Assuming there are no further discussion points, I'd like to move
>> forward
>> > > with a [VOTE]. I'll let this sit here and simmer for another 24h to
>> make
>> > > sure we are done with the discussion phase.
>> > >
>> > > Thanks,
>> > >
>> > > Markus
>> > >
>> > >
>> > > On Mon, Feb 18, 2019 at 1:08 PM Tianqi Chen 
>> wrote:
>> > >
>> > > > Thanks, everyone for helpful feedbacks. I would like to clarify a
>> few
>> > > > points being raised so far on behalf of the current TVM PMC.
>> > > >
>> > > > > PMC vs PMC member
>> > > >
>> > > > Thanks for pointing it out. This is something we overlooked and will
>> > > update
>> > > > the proposal to make the change accordingly.
>> > > >
>> > > > > Champion
>> > > >
>> > > > Markus has been actively engaging with the TVM community and helped
>> the
>> > > > community start the incubation process. These efforts include:
>> > > > - Introduce the Apache way to in the TVM conference last Dec
>> > > >-
>> > > >
>> > >
>> >
>> https://sampl.cs.washington.edu/tvmconf/slides/Markus-Weimer-TVM-Apache.pdf
>> > > > - Help the community to start the incubation conversation(also
>> Thanks
>> > to
>> > > > Sebastian and Gon)
>> > > >- https://github.com/dmlc/tvm/issues/2401
>> > > > - Watch the pre-incubation private list, and give helpful feedback
>> > > >
>> > > > While we do not expect our mentor to actively watch the community on
>> > the
>> > > > daily basis(many of our committers only contribute a few days in a
>> > week),
>> > > > he has been very responsive and helped us to shape the incubation
>> > > proposal
>> > > > and most importantly be a strong advocate of the Apache way. I
>> > personally
>> > > > think he is more than qualified as our champion:)
>> > > >
>> > > > > Hardware artifact
>> > > >
>> > > > INAL, however, given that Apache only releases source code and our
>> > source
>> > > > code is in the form of software source code (HLS C and we are
>> moving to
>> > > > Chisel-(scala) ). Then anyone can take the software source code and
>> > > > generate unofficial hardware release.
>> > > >
>> > > > Tianqi
>> > > >
>> > > >
>> > > > On Mon, Feb 18, 2019 at 6:44 AM Bertrand Delacretaz <
>> > > > bdelacre...@codeconsult.ch> wrote:
>> > > >
>> > > > > Hi,
>> > > > >
>> > > > > On Mon, Feb 18, 2019 at 11:44 AM Justin Mclean <
>> > > jus...@classsoftware.com
>> > > > >
>> > > > > wrote:
>> > > > > > > If the Apache License works for those artifacts I think that's
>> > > > fine...
>> > > > > >
>> > > > > > It probably doesn’t, but it's complex and INAL, but I have
>> touched
>> > on

Re: [Proposal] Apache TVM

2019-02-28 Thread Henry Saputra
Hi Tianqi,

What I can do instead is restructure the proposal so that the PPMC includes
the mentors and the PMC members from TVM.
The rest of the committers from TVM will then be invited through a VOTE by
the PPMC.

Would that work?

- Henry

On Thu, Feb 28, 2019 at 2:13 AM Tianqi Chen 
wrote:

> Hi Henry:
>
> Because the TVM community already adopts Apache meritocracy and has a
> separation of PMC and committers. Every new member(PMC and committers) are
> formally discussed and we welcome each member in the community by
> summarizing their contributions.
> If possible,  we would like to keep the same structure during incubation.
> The current PMC members are actively proposing new committers and PMC
> members from different organizations in the past few months and will
> continue doing so after the incubation.
>
> Tianqi
>
> On Wed, Feb 27, 2019 at 9:07 PM Henry Saputra 
> wrote:
>
> > Bit more clarifications, as new podling in Apache, the initial members of
> > PPMC consist of mentors and initial commiters of the project.
> >
> > I understand TVM already work mirroring ASF meritoracy [1] but we need to
> > change the proposal to follow Apache guidelines to help us cross check
> > membership later for onboarding.
> >
> > If it is OK with you I will change the proposal to merge the "Initial
> PPMC
> > Members" and "Initial Committers", minus the mentors from ASF, to be just
> > Initial Committers.
> >
> > Thanks,
> >
> > - Henry
> >
> >
> > [1] https://github.com/dmlc/tvm/blob/master/CONTRIBUTORS.md
> >
> > On Tue, Feb 26, 2019 at 9:56 AM Markus Weimer  wrote:
> >
> > > Thanks everyone for the discussion thus far. Based on it, I have
> uploaded
> > > an updated proposal here:
> > >
> > > https://wiki.apache.org/incubator/TVMProposal
> > >
> > > The changes made are:
> > >
> > >1. Rectify the language around PMC vs. PMC member. Thanks Greg, for
> > >pointing that out!
> > >2. Adding Furkan, Timothy and Henry as additional mentors. We can
> use
> > >all the help :)
> > >
> > > Assuming there are no further discussion points, I'd like to move
> forward
> > > with a [VOTE]. I'll let this sit here and simmer for another 24h to
> make
> > > sure we are done with the discussion phase.
> > >
> > > Thanks,
> > >
> > > Markus
> > >
> > >
> > > On Mon, Feb 18, 2019 at 1:08 PM Tianqi Chen  wrote:
> > >
> > > > Thanks, everyone for helpful feedbacks. I would like to clarify a few
> > > > points being raised so far on behalf of the current TVM PMC.
> > > >
> > > > > PMC vs PMC member
> > > >
> > > > Thanks for pointing it out. This is something we overlooked and will
> > > update
> > > > the proposal to make the change accordingly.
> > > >
> > > > > Champion
> > > >
> > > > Markus has been actively engaging with the TVM community and helped
> the
> > > > community start the incubation process. These efforts include:
> > > > - Introduce the Apache way to in the TVM conference last Dec
> > > >-
> > > >
> > >
> >
> https://sampl.cs.washington.edu/tvmconf/slides/Markus-Weimer-TVM-Apache.pdf
> > > > - Help the community to start the incubation conversation(also Thanks
> > to
> > > > Sebastian and Gon)
> > > >- https://github.com/dmlc/tvm/issues/2401
> > > > - Watch the pre-incubation private list, and give helpful feedback
> > > >
> > > > While we do not expect our mentor to actively watch the community on
> > the
> > > > daily basis(many of our committers only contribute a few days in a
> > week),
> > > > he has been very responsive and helped us to shape the incubation
> > > proposal
> > > > and most importantly be a strong advocate of the Apache way. I
> > personally
> > > > think he is more than qualified as our champion:)
> > > >
> > > > > Hardware artifact
> > > >
> > > > INAL, however, given that Apache only releases source code and our
> > source
> > > > code is in the form of software source code (HLS C and we are moving
> to
> > > > Chisel-(scala) ). Then anyone can take the software source code and
> > > > generate unofficial hardware release.
> > > >
> > > > Tianqi
> > > >
> > > >
> > > > On Mon, Feb 18, 2019 at 6:44 AM Bertrand Delacretaz <
> > > > bdelacre...@codeconsult.ch> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > On Mon, Feb 18, 2019 at 11:44 AM Justin Mclean <
> > > jus...@classsoftware.com
> > > > >
> > > > > wrote:
> > > > > > > If the Apache License works for those artifacts I think that's
> > > > fine...
> > > > > >
> > > > > > It probably doesn’t, but it's complex and INAL, but I have
> touched
> > on
> > > > > this about this in IoT talks at previous ApacheCons...
> > > > >
> > > > > FWIW the prior discussions that I mentioned are linked below - from
> > > > > board@ so accessible for ASF Members of Officers only, but we can
> > > > > distill them as needed if a concrete need appears with TVM.
> > > > >
> > > > > We didn't go past the discussions stage at that time (2011) but if
> > > > > there's another case of hardware at the ASF I'm willing to help
> > > > > restart those discussions 

Re: [Proposal] Apache TVM

2019-02-28 Thread Tianqi Chen
Hi Henry:

The TVM community already adopts Apache meritocracy and has a separation of
PMC and committers. Every new member (PMC or committer) is formally
discussed, and we welcome each member to the community by summarizing their
contributions.
If possible, we would like to keep the same structure during incubation.
The current PMC members have been actively proposing new committers and PMC
members from different organizations in the past few months and will
continue doing so after the incubation.

Tianqi

On Wed, Feb 27, 2019 at 9:07 PM Henry Saputra 
wrote:

> Bit more clarifications, as new podling in Apache, the initial members of
> PPMC consist of mentors and initial commiters of the project.
>
> I understand TVM already work mirroring ASF meritoracy [1] but we need to
> change the proposal to follow Apache guidelines to help us cross check
> membership later for onboarding.
>
> If it is OK with you I will change the proposal to merge the "Initial PPMC
> Members" and "Initial Committers", minus the mentors from ASF, to be just
> Initial Committers.
>
> Thanks,
>
> - Henry
>
>
> [1] https://github.com/dmlc/tvm/blob/master/CONTRIBUTORS.md
>
> On Tue, Feb 26, 2019 at 9:56 AM Markus Weimer  wrote:
>
> > Thanks everyone for the discussion thus far. Based on it, I have uploaded
> > an updated proposal here:
> >
> > https://wiki.apache.org/incubator/TVMProposal
> >
> > The changes made are:
> >
> >1. Rectify the language around PMC vs. PMC member. Thanks Greg, for
> >pointing that out!
> >2. Adding Furkan, Timothy and Henry as additional mentors. We can use
> >all the help :)
> >
> > Assuming there are no further discussion points, I'd like to move forward
> > with a [VOTE]. I'll let this sit here and simmer for another 24h to make
> > sure we are done with the discussion phase.
> >
> > Thanks,
> >
> > Markus
> >
> >
> > On Mon, Feb 18, 2019 at 1:08 PM Tianqi Chen  wrote:
> >
> > > Thanks, everyone for helpful feedbacks. I would like to clarify a few
> > > points being raised so far on behalf of the current TVM PMC.
> > >
> > > > PMC vs PMC member
> > >
> > > Thanks for pointing it out. This is something we overlooked and will
> > update
> > > the proposal to make the change accordingly.
> > >
> > > > Champion
> > >
> > > Markus has been actively engaging with the TVM community and helped the
> > > community start the incubation process. These efforts include:
> > > - Introduce the Apache way to in the TVM conference last Dec
> > >-
> > >
> >
> https://sampl.cs.washington.edu/tvmconf/slides/Markus-Weimer-TVM-Apache.pdf
> > > - Help the community to start the incubation conversation(also Thanks
> to
> > > Sebastian and Gon)
> > >- https://github.com/dmlc/tvm/issues/2401
> > > - Watch the pre-incubation private list, and give helpful feedback
> > >
> > > While we do not expect our mentor to actively watch the community on
> the
> > > daily basis(many of our committers only contribute a few days in a
> week),
> > > he has been very responsive and helped us to shape the incubation
> > proposal
> > > and most importantly be a strong advocate of the Apache way. I
> personally
> > > think he is more than qualified as our champion:)
> > >
> > > > Hardware artifact
> > >
> > > INAL, however, given that Apache only releases source code and our
> source
> > > code is in the form of software source code (HLS C and we are moving to
> > > Chisel-(scala) ). Then anyone can take the software source code and
> > > generate unofficial hardware release.
> > >
> > > Tianqi
> > >
> > >
> > > On Mon, Feb 18, 2019 at 6:44 AM Bertrand Delacretaz <
> > > bdelacre...@codeconsult.ch> wrote:
> > >
> > > > Hi,
> > > >
> > > > On Mon, Feb 18, 2019 at 11:44 AM Justin Mclean <
> > jus...@classsoftware.com
> > > >
> > > > wrote:
> > > > > > If the Apache License works for those artifacts I think that's
> > > fine...
> > > > >
> > > > > It probably doesn’t, but it's complex and INAL, but I have touched
> on
> > > > this about this in IoT talks at previous ApacheCons...
> > > >
> > > > FWIW the prior discussions that I mentioned are linked below - from
> > > > board@ so accessible for ASF Members of Officers only, but we can
> > > > distill them as needed if a concrete need appears with TVM.
> > > >
> > > > We didn't go past the discussions stage at that time (2011) but if
> > > > there's another case of hardware at the ASF I'm willing to help
> > > > restart those discussions to move this forward. Either to define
> which
> > > > additions to the Apache License are required, or to clarify that it's
> > > > ok as is.
> > > >
> > > > So unless there are specific objections about accepting a project
> > > > which includes hardware as a software artifact I'm in favor of
> > > > accepting TVM and sorting out these things during incubation.
> > > >
> > > > -Bertrand
> > > >
> > > > Prior board@ discussions at https://s.apache.org/hw2011_1 and
> > > > https://s.apache.org/hw2011_2
> > > >
> > > > 

Re: [Proposal] Apache TVM

2019-02-27 Thread Henry Saputra
A bit more clarification: for a new podling in Apache, the initial members of
the PPMC consist of the mentors and the initial committers of the project.

I understand TVM already works in a way that mirrors ASF meritocracy [1], but
we need to change the proposal to follow Apache guidelines to help us
cross-check membership later for onboarding.

If it is OK with you, I will change the proposal to merge the "Initial PPMC
Members" and "Initial Committers", minus the mentors from ASF, to be just
Initial Committers.

Thanks,

- Henry


[1] https://github.com/dmlc/tvm/blob/master/CONTRIBUTORS.md

On Tue, Feb 26, 2019 at 9:56 AM Markus Weimer  wrote:

> Thanks everyone for the discussion thus far. Based on it, I have uploaded
> an updated proposal here:
>
> https://wiki.apache.org/incubator/TVMProposal
>
> The changes made are:
>
>1. Rectify the language around PMC vs. PMC member. Thanks Greg, for
>pointing that out!
>2. Adding Furkan, Timothy and Henry as additional mentors. We can use
>all the help :)
>
> Assuming there are no further discussion points, I'd like to move forward
> with a [VOTE]. I'll let this sit here and simmer for another 24h to make
> sure we are done with the discussion phase.
>
> Thanks,
>
> Markus
>
>
> On Mon, Feb 18, 2019 at 1:08 PM Tianqi Chen  wrote:
>
> > Thanks, everyone for helpful feedbacks. I would like to clarify a few
> > points being raised so far on behalf of the current TVM PMC.
> >
> > > PMC vs PMC member
> >
> > Thanks for pointing it out. This is something we overlooked and will
> update
> > the proposal to make the change accordingly.
> >
> > > Champion
> >
> > Markus has been actively engaging with the TVM community and helped the
> > community start the incubation process. These efforts include:
> > - Introduce the Apache way to in the TVM conference last Dec
> >-
> >
> https://sampl.cs.washington.edu/tvmconf/slides/Markus-Weimer-TVM-Apache.pdf
> > - Help the community to start the incubation conversation(also Thanks to
> > Sebastian and Gon)
> >- https://github.com/dmlc/tvm/issues/2401
> > - Watch the pre-incubation private list, and give helpful feedback
> >
> > While we do not expect our mentor to actively watch the community on the
> > daily basis(many of our committers only contribute a few days in a week),
> > he has been very responsive and helped us to shape the incubation
> proposal
> > and most importantly be a strong advocate of the Apache way. I personally
> > think he is more than qualified as our champion:)
> >
> > > Hardware artifact
> >
> > INAL, however, given that Apache only releases source code and our source
> > code is in the form of software source code (HLS C and we are moving to
> > Chisel-(scala) ). Then anyone can take the software source code and
> > generate unofficial hardware release.
> >
> > Tianqi
> >
> >
> > On Mon, Feb 18, 2019 at 6:44 AM Bertrand Delacretaz <
> > bdelacre...@codeconsult.ch> wrote:
> >
> > > Hi,
> > >
> > > On Mon, Feb 18, 2019 at 11:44 AM Justin Mclean <
> jus...@classsoftware.com
> > >
> > > wrote:
> > > > > If the Apache License works for those artifacts I think that's
> > fine...
> > > >
> > > > It probably doesn’t, but it's complex and INAL, but I have touched on
> > > this about this in IoT talks at previous ApacheCons...
> > >
> > > FWIW the prior discussions that I mentioned are linked below - from
> > > board@ so accessible for ASF Members of Officers only, but we can
> > > distill them as needed if a concrete need appears with TVM.
> > >
> > > We didn't go past the discussions stage at that time (2011) but if
> > > there's another case of hardware at the ASF I'm willing to help
> > > restart those discussions to move this forward. Either to define which
> > > additions to the Apache License are required, or to clarify that it's
> > > ok as is.
> > >
> > > So unless there are specific objections about accepting a project
> > > which includes hardware as a software artifact I'm in favor of
> > > accepting TVM and sorting out these things during incubation.
> > >
> > > -Bertrand
> > >
> > > Prior board@ discussions at https://s.apache.org/hw2011_1 and
> > > https://s.apache.org/hw2011_2
> > >
> > >
> > >
> >
>


Re: [Proposal] Apache TVM

2019-02-27 Thread Furkan KAMACI
Thanks Markus! I will be ready to help!

On Wed, Feb 27, 2019 at 8:32 PM Henry Saputra wrote:

> Thanks, Marcus. Looking forward for the VOTE thread.
>
> This would be great addition to Apache Software Foundation.
>
> - Henry
>
> On Tue, Feb 26, 2019 at 9:56 AM Markus Weimer  wrote:
>
> > Thanks everyone for the discussion thus far. Based on it, I have uploaded
> > an updated proposal here:
> >
> > https://wiki.apache.org/incubator/TVMProposal
> >
> > The changes made are:
> >
> >1. Rectify the language around PMC vs. PMC member. Thanks Greg, for
> >pointing that out!
> >2. Adding Furkan, Timothy and Henry as additional mentors. We can use
> >all the help :)
> >
> > Assuming there are no further discussion points, I'd like to move forward
> > with a [VOTE]. I'll let this sit here and simmer for another 24h to make
> > sure we are done with the discussion phase.
> >
> > Thanks,
> >
> > Markus
> >
> >
> > On Mon, Feb 18, 2019 at 1:08 PM Tianqi Chen  wrote:
> >
> > > Thanks, everyone for helpful feedbacks. I would like to clarify a few
> > > points being raised so far on behalf of the current TVM PMC.
> > >
> > > > PMC vs PMC member
> > >
> > > Thanks for pointing it out. This is something we overlooked and will
> > update
> > > the proposal to make the change accordingly.
> > >
> > > > Champion
> > >
> > > Markus has been actively engaging with the TVM community and helped the
> > > community start the incubation process. These efforts include:
> > > - Introduce the Apache way to in the TVM conference last Dec
> > >-
> > >
> >
> https://sampl.cs.washington.edu/tvmconf/slides/Markus-Weimer-TVM-Apache.pdf
> > > - Help the community to start the incubation conversation(also Thanks
> to
> > > Sebastian and Gon)
> > >- https://github.com/dmlc/tvm/issues/2401
> > > - Watch the pre-incubation private list, and give helpful feedback
> > >
> > > While we do not expect our mentor to actively watch the community on
> the
> > > daily basis(many of our committers only contribute a few days in a
> week),
> > > he has been very responsive and helped us to shape the incubation
> > proposal
> > > and most importantly be a strong advocate of the Apache way. I
> personally
> > > think he is more than qualified as our champion:)
> > >
> > > > Hardware artifact
> > >
> > > INAL, however, given that Apache only releases source code and our
> source
> > > code is in the form of software source code (HLS C and we are moving to
> > > Chisel-(scala) ). Then anyone can take the software source code and
> > > generate unofficial hardware release.
> > >
> > > Tianqi
> > >
> > >
> > > On Mon, Feb 18, 2019 at 6:44 AM Bertrand Delacretaz <
> > > bdelacre...@codeconsult.ch> wrote:
> > >
> > > > Hi,
> > > >
> > > > On Mon, Feb 18, 2019 at 11:44 AM Justin Mclean
> > > > <jus...@classsoftware.com> wrote:
> > > > > > If the Apache License works for those artifacts I think that's
> > > > > > fine...
> > > > >
> > > > > It probably doesn’t, but it's complex and IANAL. I have touched on
> > > > > this in IoT talks at previous ApacheCons...
> > > >
> > > > FWIW the prior discussions that I mentioned are linked below - from
> > > > board@ so accessible to ASF Members or Officers only, but we can
> > > > distill them as needed if a concrete need appears with TVM.
> > > >
> > > > We didn't go past the discussion stage at that time (2011) but if
> > > > there's another case of hardware at the ASF I'm willing to help
> > > > restart those discussions to move this forward, either to define
> > > > which additions to the Apache License are required, or to clarify
> > > > that it's ok as is.
> > > >
> > > > So unless there are specific objections about accepting a project
> > > > which includes hardware as a software artifact I'm in favor of
> > > > accepting TVM and sorting out these things during incubation.
> > > >
> > > > -Bertrand
> > > >
> > > > Prior board@ discussions at https://s.apache.org/hw2011_1 and
> > > > https://s.apache.org/hw2011_2
> > > >
> > > > -
> > > > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> > > > For additional commands, e-mail: general-h...@incubator.apache.org
> > > >
> > > >
> > >
> >
>


Re: [Proposal] Apache TVM

2019-02-27 Thread Henry Saputra
Thanks, Markus. Looking forward to the VOTE thread.

This would be a great addition to the Apache Software Foundation.

- Henry

On Tue, Feb 26, 2019 at 9:56 AM Markus Weimer  wrote:

> Thanks everyone for the discussion thus far. Based on it, I have uploaded
> an updated proposal here:
>
> https://wiki.apache.org/incubator/TVMProposal
>
> The changes made are:
>
>    1. Rectified the language around PMC vs. PMC member. Thanks, Greg, for
>    pointing that out!
>    2. Added Furkan, Timothy and Henry as additional mentors. We can use
>    all the help :)
>
> Assuming there are no further discussion points, I'd like to move forward
> with a [VOTE]. I'll let this sit here and simmer for another 24h to make
> sure we are done with the discussion phase.
>
> Thanks,
>
> Markus
>
>
> On Mon, Feb 18, 2019 at 1:08 PM Tianqi Chen  wrote:
>
> > Thanks, everyone, for the helpful feedback. I would like to clarify a few
> > points raised so far on behalf of the current TVM PMC.
> >
> > > PMC vs PMC member
> >
> > Thanks for pointing it out. This is something we overlooked; we will
> > update the proposal accordingly.
> >
> > > Champion
> >
> > Markus has been actively engaging with the TVM community and helped the
> > community start the incubation process. These efforts include:
> > - Introducing the Apache Way at the TVM conference last Dec
> >   - https://sampl.cs.washington.edu/tvmconf/slides/Markus-Weimer-TVM-Apache.pdf
> > - Helping the community start the incubation conversation (thanks also to
> >   Sebastian and Gon)
> >   - https://github.com/dmlc/tvm/issues/2401
> > - Watching the pre-incubation private list and giving helpful feedback
> >
> > While we do not expect our mentor to actively watch the community on a
> > daily basis (many of our committers only contribute a few days a week),
> > he has been very responsive, has helped us shape the incubation proposal,
> > and, most importantly, has been a strong advocate of the Apache Way. I
> > personally think he is more than qualified as our champion :)
> >
> > > Hardware artifact
> >
> > IANAL; however, Apache only releases source code, and our source code
> > is in the form of software source code (HLS C, and we are moving to
> > Chisel (Scala)), so anyone can take the software source code and
> > generate an unofficial hardware release.
> >
> > Tianqi
> >
> >
> > On Mon, Feb 18, 2019 at 6:44 AM Bertrand Delacretaz <
> > bdelacre...@codeconsult.ch> wrote:
> >
> > > Hi,
> > >
> > > On Mon, Feb 18, 2019 at 11:44 AM Justin Mclean
> > > <jus...@classsoftware.com> wrote:
> > > > > If the Apache License works for those artifacts I think that's
> > > > > fine...
> > > >
> > > > It probably doesn’t, but it's complex and IANAL. I have touched on
> > > > this in IoT talks at previous ApacheCons...
> > >
> > > FWIW the prior discussions that I mentioned are linked below - from
> > > board@ so accessible to ASF Members or Officers only, but we can
> > > distill them as needed if a concrete need appears with TVM.
> > >
> > > We didn't go past the discussion stage at that time (2011) but if
> > > there's another case of hardware at the ASF I'm willing to help
> > > restart those discussions to move this forward, either to define which
> > > additions to the Apache License are required, or to clarify that it's
> > > ok as is.
> > >
> > > So unless there are specific objections about accepting a project
> > > which includes hardware as a software artifact I'm in favor of
> > > accepting TVM and sorting out these things during incubation.
> > >
> > > -Bertrand
> > >
> > > Prior board@ discussions at https://s.apache.org/hw2011_1 and
> > > https://s.apache.org/hw2011_2
> > >
> > > -
> > > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> > > For additional commands, e-mail: general-h...@incubator.apache.org
> > >
> > >
> >
>


Re: [PROPOSAL] Apache DataSketches

2019-02-26 Thread Liang Chen
Hi Justin

You are right, it should be "Liang Chen"; I have already updated it.

Justin, could you please check whether I have permission to create a new
proposal on the Incubator wiki at:
https://wiki.apache.org/incubator/ProjectProposals

Regards
Liang

Justin Mclean wrote
> Hi,
> 
>> Currently only IPMC members can be official mentors, of the 3 people
>> listed here I believe only Jean-Baptiste Onofré is an IPMC member.
> 
> Sorry, my apologies, Liang Chen is also an IPMC member (Chen Liang,
> presumably a different person, is a committer but not an IPMC member), but
> I cannot find Gil Yehuda. Do you mind providing a link to the roster for
> them or their Apache id?
> 
> Thanks,
> Justin

--
Sent from: http://apache-incubator-general.996316.n3.nabble.com/




Re: [Proposal] Apache TVM

2019-02-26 Thread Markus Weimer
Thanks everyone for the discussion thus far. Based on it, I have uploaded
an updated proposal here:

https://wiki.apache.org/incubator/TVMProposal

The changes made are:

   1. Rectified the language around PMC vs. PMC member. Thanks, Greg, for
   pointing that out!
   2. Added Furkan, Timothy and Henry as additional mentors. We can use
   all the help :)

Assuming there are no further discussion points, I'd like to move forward
with a [VOTE]. I'll let this sit here and simmer for another 24h to make
sure we are done with the discussion phase.

Thanks,

Markus


On Mon, Feb 18, 2019 at 1:08 PM Tianqi Chen  wrote:

> Thanks, everyone, for the helpful feedback. I would like to clarify a few
> points raised so far on behalf of the current TVM PMC.
>
> > PMC vs PMC member
>
> Thanks for pointing it out. This is something we overlooked; we will update
> the proposal accordingly.
>
> > Champion
>
> Markus has been actively engaging with the TVM community and helped the
> community start the incubation process. These efforts include:
> - Introducing the Apache Way at the TVM conference last Dec
>   - https://sampl.cs.washington.edu/tvmconf/slides/Markus-Weimer-TVM-Apache.pdf
> - Helping the community start the incubation conversation (thanks also to
>   Sebastian and Gon)
>   - https://github.com/dmlc/tvm/issues/2401
> - Watching the pre-incubation private list and giving helpful feedback
>
> While we do not expect our mentor to actively watch the community on a
> daily basis (many of our committers only contribute a few days a week),
> he has been very responsive, has helped us shape the incubation proposal,
> and, most importantly, has been a strong advocate of the Apache Way. I
> personally think he is more than qualified as our champion :)
>
> > Hardware artifact
>
> IANAL; however, Apache only releases source code, and our source code is
> in the form of software source code (HLS C, and we are moving to
> Chisel (Scala)), so anyone can take the software source code and
> generate an unofficial hardware release.
>
> Tianqi
>
>
> On Mon, Feb 18, 2019 at 6:44 AM Bertrand Delacretaz <
> bdelacre...@codeconsult.ch> wrote:
>
> > Hi,
> >
> > On Mon, Feb 18, 2019 at 11:44 AM Justin Mclean wrote:
> > > > If the Apache License works for those artifacts I think that's
> > > > fine...
> > >
> > > It probably doesn’t, but it's complex and IANAL. I have touched on
> > > this in IoT talks at previous ApacheCons...
> >
> > FWIW the prior discussions that I mentioned are linked below - from
> > board@ so accessible to ASF Members or Officers only, but we can
> > distill them as needed if a concrete need appears with TVM.
> >
> > We didn't go past the discussion stage at that time (2011) but if
> > there's another case of hardware at the ASF I'm willing to help
> > restart those discussions to move this forward, either to define which
> > additions to the Apache License are required, or to clarify that it's
> > ok as is.
> >
> > So unless there are specific objections about accepting a project
> > which includes hardware as a software artifact I'm in favor of
> > accepting TVM and sorting out these things during incubation.
> >
> > -Bertrand
> >
> > Prior board@ discussions at https://s.apache.org/hw2011_1 and
> > https://s.apache.org/hw2011_2
> >
> > -
> > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> > For additional commands, e-mail: general-h...@incubator.apache.org
> >
> >
>

