Re: [VOTE] accept Tashi into the Incubator
The voting appears to have ended, with 14+ votes and zero negative votes, and three volunteers to be mentors (thank you!):

1. Matthieu Riou ([EMAIL PROTECTED])
2. Craig L Russell ([EMAIL PROTECTED])
3. Paul Fremantle ([EMAIL PROTECTED])

What is the next step for admission into the incubator?

Dave O

On Thu, Aug 14, 2008 at 10:11 PM, Matthieu Riou <[EMAIL PROTECTED]> wrote:
> So shouldn't this vote get tallied now? Seems that we're well past the 72
> hours.
>
> Matthieu
>
> On Thu, Aug 14, 2008 at 12:44 PM, Matt Hogstrom <[EMAIL PROTECTED]> wrote:
>
>> +1
>>
>> On Aug 4, 2008, at 1:48 PM, Doug Cutting wrote:
>>
>>> Please vote on accepting Tashi into the Incubator.
>>>
>>> Tashi's proposal is at:
>>>
>>> http://wiki.apache.org/incubator/TashiProposal
>>>
>>> Thanks!
>>>
>>> Doug

--
-- David O'Hallaron
-- Director, Intel Research Pittsburgh
-- Assoc Prof of CS and ECE, Carnegie Mellon University
-- http://www.cs.cmu.edu/~droh

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Re: [PROPOSAL] Tashi
No worries. I've removed the entry on the wiki version of the proposal at http://wiki.apache.org/incubator/TashiProposal. It now reads simply:

Initially, there will be one committer each from Carnegie Mellon and Intel Research:
 * Michael Stroucken ([EMAIL PROTECTED])
 * Michael Ryan ([EMAIL PROTECTED])

Dave

On Tue, Jul 22, 2008 at 8:36 PM, William A. Rowe, Jr. <[EMAIL PROTECTED]> wrote:
> Doug Cutting wrote:
>>
>> Noel J. Bergman wrote:
>>>
>>> With respect to "Initially, we plan to start with one committer each from
>>> Carnegie Mellon and Intel Research, with a Yahoo committer to be determined
>>> later", that's awkwardly phrased. It appears to imply a corporate
>>> representative doing commits for "hidden" people, something that we consider
>>> to be an anti-pattern.
>>
>> That is not the intent. The intent is for Yahoo! to assign someone to
>> work on this project as a direct contributor. But that person has not yet
>> been identified.
>>
>>> Whoever is to be actively involved in development
>>> should be on the committer list. If it is a community of just The Two
>>> Michaels, fine, but the wording should be rephrased.
>>
>> +1 If Y! does not name someone soon, then that entry should be removed. A
>> committer from Y! can always be added later, based on merit.
>
> No; the entry should be removed now.
>
> Yahoo the Company cannot place a reservation on a place at the table for
> an unnamed body. This is not how the ASF works. Nor are committers
> expressed as delegates of the institutions they work for/study at.
> This places a cloud over the acceptance of this particular project, and
> I would encourage everyone to be sure their mentors/champions review the
> text before posting a proposal for incubation.
>
> Bill

--
-- David O'Hallaron
-- Director, Intel Research Pittsburgh
-- Assoc Prof of CS and ECE, Carnegie Mellon University
-- http://www.cs.cmu.edu/~droh
Re: [PROPOSAL] Tashi
Matthieu,

Thanks very much. I'll add you as a proposed mentor on the wiki proposal.

http://wiki.apache.org/incubator/TashiProposal

Dave

On Tue, Jul 15, 2008 at 3:00 PM, Matthieu Riou <[EMAIL PROTECTED]> wrote:
> On Tue, Jul 15, 2008 at 6:55 AM, David O'Hallaron <[EMAIL PROTECTED]> wrote:
>
>> Matthieu,
>>
>> > * The sponsoring entity is the Incubator so I'm guessing you're shooting
>> > for graduating as a TLP. What kind of interactions do you foresee with
>> > Hadoop for example?
>>
>> We talked with Doug Cutting about whether to shoot for a TLP or a
>> subproject under Hadoop. We decided ultimately to go the TLP route
>> because Tashi and Hadoop are at different levels in the stack.
>> However, we see Hadoop as one of the important applications running on
>> Tashi virtual clusters, so the two projects are quite complementary.
>>
>> > * IIUC Tashi is only about management, not about the underlying
>> > storage/computing technology. Is that correct? And if so which ones do you
>> > plan to integrate with?
>>
>> Yes, that's correct. We plan to integrate with the major VMMs, such as
>> Xen, Linux KVM and VMware. For storage, we're looking at integrating
>> with HDFS, PVFS and later pNFS when it becomes more mature. An
>> important goal is to provide the hooks and interfaces that allow any
>> DFS and VMM vendor to integrate with the system.
>>
>> > * I can't help asking for more technical details. What's the
>> > implementation language for your POC code? What are the non-proprietary
>> > interfaces you're thinking of?
>>
>> The POC code is a couple of thousand lines of original Python code.
>> The major components are a cluster manager (cm), which runs on one of the
>> cluster nodes, a node manager (nm), which runs on each of the other
>> physical cluster nodes, a simple db on the cluster manager for
>> configuration data, and some client utilities.
>>
>> We're really thinking hard about interfaces now, but don't have clear
>> definitions yet. The kinds of things we are thinking about are:
>>
>> - interfaces between the client and the cm, and between the cm and the
>>   nm, for starting, stopping, and migrating VMs;
>> - interfaces for some kind of event/messaging system for monitoring and
>>   reporting system state to the cm and client;
>> - interfaces between the cm/nm and the storage system to allow the cm to
>>   do storage-aware scheduling of VMs;
>> - interfaces to power management and system management hardware
>>   features; and
>> - possibly interfaces for federating different Tashi clusters.
>
> Okay, sounds good to me, thanks for the clarifications.
>
> Also if you need another mentor, you can count me in.
>
> Cheers,
> Matthieu
>
>> Thanks,
>>
>> Dave

--
-- David O'Hallaron
-- Director, Intel Research Pittsburgh
-- Assoc Prof of CS and ECE, Carnegie Mellon University
-- http://www.cs.cmu.edu/~droh
Re: [PROPOSAL] Tashi
Matthieu,

> * The sponsoring entity is the Incubator so I'm guessing you're shooting
> for graduating as a TLP. What kind of interactions do you foresee with
> Hadoop for example?

We talked with Doug Cutting about whether to shoot for a TLP or a subproject under Hadoop. We decided ultimately to go the TLP route because Tashi and Hadoop are at different levels in the stack. However, we see Hadoop as one of the important applications running on Tashi virtual clusters, so the two projects are quite complementary.

> * IIUC Tashi is only about management, not about the underlying
> storage/computing technology. Is that correct? And if so which ones do you
> plan to integrate with?

Yes, that's correct. We plan to integrate with the major VMMs, such as Xen, Linux KVM and VMware. For storage, we're looking at integrating with HDFS, PVFS and later pNFS when it becomes more mature. An important goal is to provide the hooks and interfaces that allow any DFS and VMM vendor to integrate with the system.

> * I can't help asking for more technical details. What's the
> implementation language for your POC code? What are the non-proprietary
> interfaces you're thinking of?

The POC code is a couple of thousand lines of original Python code. The major components are a cluster manager (cm), which runs on one of the cluster nodes, a node manager (nm), which runs on each of the other physical cluster nodes, a simple db on the cluster manager for configuration data, and some client utilities.

We're really thinking hard about interfaces now, but don't have clear definitions yet. The kinds of things we are thinking about are:

- interfaces between the client and the cm, and between the cm and the nm, for starting, stopping, and migrating VMs;
- interfaces for some kind of event/messaging system for monitoring and reporting system state to the cm and client;
- interfaces between the cm/nm and the storage system to allow the cm to do storage-aware scheduling of VMs;
- interfaces to power management and system management hardware features; and
- possibly interfaces for federating different Tashi clusters.

Thanks,

Dave

--
-- David O'Hallaron
-- Director, Intel Research Pittsburgh
-- Assoc Prof of CS and ECE, Carnegie Mellon University
-- http://www.cs.cmu.edu/~droh
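To make the cm/nm split described in this message a little more concrete, here is a minimal in-process sketch in Python (the language of the POC). It is not the Tashi code and does not reflect any interface the project has defined: the class names, method names, and the least-loaded placement rule are all hypothetical, the "simple db" is stood in for by a dictionary, and the real components would run on separate machines and talk over the network.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Hypothetical node manager (nm): tracks the VMs on one physical node."""
    name: str
    vms: set = field(default_factory=set)

    def start_vm(self, vm_id: str) -> None:
        self.vms.add(vm_id)

    def stop_vm(self, vm_id: str) -> None:
        self.vms.discard(vm_id)


class ClusterManager:
    """Hypothetical cluster manager (cm): decides which nm hosts each VM."""

    def __init__(self, nodes):
        # Stand-in for the cm's configuration db: node name -> node manager.
        self.nodes = {nm.name: nm for nm in nodes}

    def locate(self, vm_id):
        """Return the name of the node hosting vm_id, or None if not running."""
        for nm in self.nodes.values():
            if vm_id in nm.vms:
                return nm.name
        return None

    def start_vm(self, vm_id):
        """Start vm_id on the least-loaded node (assumed policy, ties by name)."""
        if self.locate(vm_id) is not None:
            raise ValueError(f"{vm_id} is already running")
        target = min(self.nodes.values(), key=lambda nm: (len(nm.vms), nm.name))
        target.start_vm(vm_id)
        return target.name

    def stop_vm(self, vm_id):
        host = self.locate(vm_id)
        if host is None:
            raise KeyError(vm_id)
        self.nodes[host].stop_vm(vm_id)

    def migrate_vm(self, vm_id, dest):
        """Move vm_id to the node named dest; a no-op if it is already there."""
        src = self.locate(vm_id)
        if src is None:
            raise KeyError(vm_id)
        if dest not in self.nodes:
            raise KeyError(dest)
        if src != dest:
            self.nodes[dest].start_vm(vm_id)
            self.nodes[src].stop_vm(vm_id)
        return dest
```

A client would then drive everything through the cm, for example `cm = ClusterManager([NodeManager("node1"), NodeManager("node2")])` followed by `cm.start_vm("vm-a")`. A storage-aware scheduler, as mentioned above, would replace the least-loaded rule with one that also consults the storage system.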
Re: [PROPOSAL] Tashi
Noel,

I've fixed the wording on the wiki text to clear up the initial committers:

We've been talking with the storage group at HP, haven't approached the others yet, but would certainly welcome them.

Thanks!

Dave

http://wiki.apache.org/incubator/TashiProposal

***

David,

I just reviewed http://wiki.apache.org/incubator/TashiProposal. Interesting.

Has anyone been in touch with VMware, XenSource, Google, Amazon, et al to invite them to participate?

With respect to "Initially, we plan to start with one committer each from Carnegie Mellon and Intel Research, with a Yahoo committer to be determined later", that's awkwardly phrased. It appears to imply a corporate representative doing commits for "hidden" people, something that we consider to be an anti-pattern. Whoever is to be actively involved in development should be on the committer list. If it is a community of just The Two Michaels, fine, but the wording should be rephrased.

--- Noel

--
-- David O'Hallaron
-- Director, Intel Research Pittsburgh
-- Assoc Prof of CS and ECE, Carnegie Mellon University
-- http://www.cs.cmu.edu/~droh
[PROPOSAL] Tashi
This is a proposal to enter the incubator. See http://wiki.apache.org/incubator/TashiProposal for the most up-to-date version. We're looking forward to comments from the community.

Thanks!

Dave

--
-- David O'Hallaron
-- Director, Intel Research Pittsburgh
-- Assoc Prof of CS and ECE, Carnegie Mellon University
-- http://www.cs.cmu.edu/~droh

---cut--

= Tashi Proposal =

A proposal to the Apache Software Foundation Incubator PMC by David O'Hallaron^*+^, Michael Kozuch^*^, Michael Ryan^*^, Steven Schlosser^*^, Jim Cipar^+^, Greg Ganger^+^, Garth Gibson^+^, Julio Lopez^+^, Michael Stroucken^+^, Wittawat Tantisiriroj^+^, Doug Cutting^#^, Jay Kistler^#^, Thomas Kwan^#^

^*^Intel Research Pittsburgh, ^+^Carnegie Mellon University, ^#^Yahoo!

July 10, 2008

== 1. Abstract ==

Tashi is a cluster management system for cloud computing on Big Data.

== 2. Proposal ==

The Tashi project aims to build a software infrastructure for cloud computing on massive internet-scale datasets (what we call ''Big Data''). The idea is to build a cluster management system that enables the Big Data that are stored in a cluster/data center to be accessed, shared, manipulated, and computed on by remote users in a convenient, efficient, and safe manner. The system aims to provide the following basic capabilities:

(a) ''On-demand provisioning of storage and compute resources.'' Users request a number of compute nodes, which can be either virtual or physical machines, and a set of disk images to boot up on the nodes. In response they receive their own persistent logical cluster of compute and storage nodes, which they can then manage and use.

(b) ''Extensible end-to-end system management.'' Tashi will define open non-proprietary interfaces for management tasks such as observation, inference, planning, and actuation. This will keep the system vendor-neutral and allow different research and development groups to plug in their own implementations of the management modules.

(c) ''Cooperative storage and compute management.'' The system will define new non-proprietary interfaces and methods that will allow compute and storage management to work in concert.

(d) ''Flexible storage models.'' The system will support a range of different storage models, such as network-attached storage, per-node storage, and hybrids, to allow developers, researchers, and large-scale cluster/data center operators to experiment with different kinds of file systems.

(e) ''Flexible machine models.'' The system will support different machine models. In particular, it will be VMM-agnostic, able to run different virtual machine monitors such as KVM and Xen. Also, in order to address the cluster squatting problem (when clusters are balkanized by users who reserve and hold nodes for their exclusive use), the system will support a novel bi-modal booting capability, in which virtual machine and physical machine instances can boot from the same disk image.

== 3. Rationale and Approach ==

Digital media, pervasive sensing, web authoring, mobile computing, scientific and medical instruments, physical simulations, and virtual worlds are all delivering vast new datasets relating to every aspect of our lives. A growing fraction of this Big Data is going unused or being underexploited due to the overwhelming scale of the data involved. Effective sharing, understanding, and use of this new wealth of raw information poses one of the great challenges for the new century.

In order to compute on this emerging Big Data, many research and development groups are purchasing their own racks of compute and storage servers. The goal of the Tashi project is to develop a layer of utility software that turns these raw racks of servers into easily managed cloud computers that will allow remote users to share and explore their Big Data.

To our knowledge there are no open source projects addressing cluster management for Big Data applications. We need a project such as Tashi for a number of reasons:

(1) No cloud computing cluster management systems have tackled the problem of having both compute and storage management working in concert, which we believe will be necessary to support Big Data.

(2) We need non-proprietary interfaces for cloud computing, and open source is the way to develop these. For example, Google's new App Engine and Amazon's web services require people to build to proprietary APIs, so that their applications are no longer vendor-neutral, but are tied to a particular service provider.

(3) We need an extensible system that can serve as a platform to stimulate research in cluster management for cloud computing.

The Tashi system is targeted at two (not always distinct) communities:

(1) As a production system for organizations that want to offer medium- to large-scale clusters to their users.