Hello, everyone --as promised, we are live:
- NASDAQ Globenewswire http://globenewswire.com/news-release/2015/04/27/728529/10130773/en/The-Apache-Software-Foundation-Announces-Apache-tm-Parquet-tm-as-a-Top-Level-Project.html
- ASF "Foundation" blog http://s.apache.org/L0H
- @TheASF Twitter feed https://twitter.com/TheASF/status/592644433813884929

...plus [email protected] and our dedicated media/analyst list.

This will appear on the apache.org homepage and the mail archives during the next auto-update, which should take place within the hour.

Thanks again for all your help, and congratulations on reaching this milestone!

Warmly,
Sally

________________________________
From: Sally Khudairi <[email protected]>
To: Julien Le Dem <[email protected]>
Cc: "[email protected]" <[email protected]>; Sally Khudairi <[email protected]>; Daniel Weeks <[email protected]>; Chris Aniszczyk <[email protected]>; Ryan Blue <[email protected]>; "[email protected]" <[email protected]>; "Mattmann, Chris A (3980)" <[email protected]>; "[email protected]" <[email protected]>
Sent: Sunday, 26 April 2015, 19:22
Subject: Re: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation blog post?]

Perfect. Thank you, Julien!

I'll confirm once we're live tomorrow morning.

Warmly,
Sally

________________________________
From: Julien Le Dem <[email protected]>
To: Sally Khudairi <[email protected]>
Cc: "[email protected]" <[email protected]>; Sally Khudairi <[email protected]>; Daniel Weeks <[email protected]>; Chris Aniszczyk <[email protected]>; Ryan Blue <[email protected]>; "[email protected]" <[email protected]>; "Mattmann, Chris A (3980)" <[email protected]>; "[email protected]" <[email protected]>
Sent: Sunday, 26 April 2015, 19:21
Subject: Re: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation blog post?]

Sounds good. Thank you!

On Sunday, April 26, 2015, Sally Khudairi <[email protected]> wrote:

Thanks, Julien --I can include that, yes.
>
>Does this work for you?
>
><snip>
>
>Catch Apache Parquet in action at the Hadoop Summit, 9-11 June 2015 in San
>Jose, California. The Apache Parquet project welcomes contributions and
>community participation through mailing lists, face-to-face MeetUps, and user
>events. For more information, visit http://parquet.apache.org/community/
>
></snip>
>
>Warmest regards,
>Sally
>
>
>Did you want to mention the parquet talks at the Hadoop summit in June?
>Otherwise this looks good to me.
>
>
>On Sunday, April 26, 2015, Sally Khudairi <[email protected]>
>wrote:
>
>Hi everyone --I haven't received any other feedback, so I think we're all set
>to announce tomorrow.
>>I'd like to issue the press release at 7AM ET. I'll confirm when we're
>>live.
>>If there are any showstoppers, please let me know ASAP.
>>Thanks so much,
>>Sally
>>
>> From: Sally Khudairi <[email protected]>
>> To: Sally Khudairi <[email protected]>; Daniel Weeks
>> <[email protected]>; "[email protected]"
>> <[email protected]>
>> Cc: Chris Aniszczyk <[email protected]>; Ryan Blue <[email protected]>;
>> "[email protected]" <[email protected]>; "Mattmann, Chris A (3980)"
>> <[email protected]>; "[email protected]" <[email protected]>
>> Sent: Friday, 24 April 2015, 16:17
>> Subject: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation
>> blog post?]
>>
>>Hello again, everyone --below is the latest draft.
>>
>>Please review and forward any changes/additions no later than 5PM ET on
>>Sunday in order for us to announce on Monday morning. I was aiming to go live
>>by 7AM ET if that works for you.
>>
>>Kindly confirm.
>> >>Thanks in advance, >>Sally >> >>= = = >> >>DRAFT :: NOT FOR DISTRIBUTION >> >>The Apache Software Foundation Announces Apache™ Parquet™ as a Top-Level >>Project >> >>Open Source storage format for the Apache™ Hadoop® ecosystem in use at >>Cloudera, NASA, Netflix, Stripe and Twitter, among other organizations >> >>Forest Hill, MD –27 April 2015– The Apache Software Foundation (ASF), the >>all-volunteer developers, stewards, and incubators of more than 350 Open >>Source projects and initiatives, announced today that Apache™ Parquet™ has >>graduated from the Apache Incubator to become a Top-Level Project (TLP), >>signifying that the project's community and products have been well-governed >>under the ASF's meritocratic process and principles. >> >>"The incubation process at Apache has been fantastic and really the last step >>of making Parquet a community driven standard fully integrated within the >>greater Hadoop ecosystem," said Julien Le Dem, Vice President of Apache >>Parquet. >> >>Apache Parquet is an Open Source columnar storage format for the Apache™ >>Hadoop® ecosystem, built to work across programming languages and much more: >> >> >> - processing frameworks (MapReduce, Apache Spark, Scalding, Cascading, >> Crunch, Kite) >> - data models (Apache Avro, Apache Thrift, Protocol Buffers, POJOs) >> - query engines (Apache Hive, Impala, HAWQ, Apache Drill, Apache Tajo, >> Apache Pig, Presto, Apache Spark SQL) >> >>"At Twitter, Parquet has helped us scale our big data usage by in some cases >>reducing storage requirements by one third on large datasets as well as scan >>and deserialization time. This translated into hardware savings as well as >>reduced latency for accessing the data. Furthermore, Parquet being integrated >>with so many tools creates opportunities and flexibility regarding query >>engines," said Chris Aniszczyk, Head of Open Source at Twitter. 
"Finally, >>it's just fantastic to see it graduate to a top-level project and we look >>forward to further collaborating with the Apache Parquet community to >>continually improve performance." >> >>"Parquet’s integration with other object models, like Avro and Thrift, has >>been a key feature for our customers," said Ryan Blue, Software Engineer at >>Cloudera. "They can take advantage of columnar storage without changing the >>classes they already use in their production applications." >> >>"At Netflix, Parquet is the primary storage format for data warehousing. More >>than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted data that >>we query across a wide range of tools including Apache Hive, Apache Pig, >>Apache Spark, PigPen, Presto, and native MapReduce. The performance benefit >>of columnar projection and statistics is a game changer for our big data >>platform," said Daniel Weeks, Software Engineer at Netflix. "We look forward >>to working with the Apache community to advance the state of big data storage >>with Parquet and are excited to see the project graduate to full Apache >>status." >> >>"Stripe's data warehouse has been built on Parquet from the beginning," said >>Avi Bryant, Engineering Manager at Stripe. "Every aspect of our pipeline, >>from data import to machine learning to adhoc SQL analysis, uses Apache >>Parquet as the common interchange format." >> >>"I was extremely happy to see Parquet arrive as an Incubator project," said >>Chris Mattmann, Apache Parquet Incubator Mentor, and Chief Architect, >>Instrument and Science Data Systems Section at NASA Jet Propulsion >>Laboratory. "After talking with some in its community there was a real match >>with this columnar data format technology and its community with the way that >>we do things here at the ASF. Parquet has had an exemplar Incubation, and the >>project has big things ahead of it. 
I am encouraging my Data Science Team at >>NASA to evaluate it for data representation especially as it relates to our >>science holdings in Earth, planetary and space sciences, and astrophysics." >> >>The Apache Parquet project welcomes contributions and community participation >>through mailing lists, face-to-face MeetUps, and user events. For more >>information, visit http://parquet.apache.org/community/ >> >>Availability and Oversight >>Apache Parquet software is released under the Apache License v2.0 and is >>overseen by a self-selected team of active contributors to the project. A >>Project Management Committee (PMC) guides the Project's day-to-day >>operations, including community development and product releases. For >>downloads, documentation, and ways to become involved with Apache Parquet, >>visit http://parquet.apache.org/ and https://twitter.com/ApacheParquet >> >>About the Apache Incubator >>The Apache Incubator is the entry path for projects and codebases wishing to >>become part of the efforts at The Apache Software Foundation. All code >>donations from external organizations and existing external projects wishing >>to join the ASF enter through the Incubator to: 1) ensure all donations are >>in accordance with the ASF legal standards; and 2) develop new communities >>that adhere to our guiding principles. Incubation is required of all newly >>accepted projects until a further review indicates that the infrastructure, >>communications, and decision making process have stabilized in a manner >>consistent with other successful ASF projects. While incubation status is not >>necessarily a reflection of the completeness or stability of the code, it >>does indicate that the project has yet to be fully endorsed by the ASF. For >>more information, visit http://incubator.apache.org/. 
>> >>About The Apache Software Foundation (ASF) >>Established in 1999, the all-volunteer Foundation oversees more than 350 >>leading Open Source projects, including Apache HTTP Server --the world's most >>popular Web server software. Through the ASF's meritocratic process known as >>"The Apache Way," more than 500 individual Members and 4,500 Committers >>successfully collaborate to develop freely available enterprise-grade >>software, benefiting millions of users worldwide: thousands of software >>solutions are distributed under the Apache License; and the community >>actively participates in ASF mailing lists, mentoring initiatives, and >>ApacheCon, the Foundation's official user conference, trainings, and expo. >>The ASF is a US 501(c)(3) charitable organization, funded by individual >>donations and corporate sponsors including Bloomberg, Budget Direct, Cerner, >>Citrix, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, IBM, InMotion >>Hosting, iSigma, Matt Mullenweg, Microsoft, Pivotal, Produban, WANdisco, and >>Yahoo. For more information, visit http://www.apache.org/ or follow @TheASF >>on Twitter. >> >>© The Apache Software Foundation. "Apache", "Avro", "Apache Avro", "Drill", >>"Apache Drill", "Hadoop", "Apache Hadoop", "Parquet", "Apache Parquet", >>"Pig", "Apache Pig", "Spark", "Apache Spark", "Tajo", "Apache Tajo", >>"Thrift", "Apache Thrift", and "ApacheCon" are registered trademarks or >>trademarks of the Apache Software Foundation in the United States and/or >>other countries. All other brands and trademarks are the property of their >>respective owners. 
>>
>># # #
>>
>>[MEDIA CONTACT:SALLY]
>>________________________________
>>
>>From: Sally Khudairi <[email protected]>
>>To: Sally Khudairi <[email protected]>; Daniel Weeks
>><[email protected]>; "[email protected]"
>><[email protected]>
>>Cc: Chris Aniszczyk <[email protected]>; Ryan Blue <[email protected]>;
>>"[email protected]" <[email protected]>; "Mattmann, Chris A (3980)"
>><[email protected]>; "[email protected]" <[email protected]>
>>Sent: Friday, 24 April 2015, 13:56
>>Subject: Re: Graduation blog post?
>>
>>Done.
>>
>>ALL: can you please let me know if there are any events that Parquet will be
>>at? Presenting? Hosting? etc.
>>
>>Thank you!
>>
>>-Sally
>>
>>________________________________
>>From: Sally Khudairi <[email protected]>
>>To: Daniel Weeks <[email protected]>; "[email protected]"
>><[email protected]>
>>Cc: Chris Aniszczyk <[email protected]>; Ryan Blue <[email protected]>;
>>"[email protected]" <[email protected]>; "Mattmann, Chris A (3980)"
>><[email protected]>; "[email protected]" <[email protected]>
>>Sent: Friday, 24 April 2015, 13:40
>>Subject: Re: Graduation blog post?
>>
>>Of course --I'll fix that now!
>>
>>Sorry about that, Daniel.
>>
>>-Sally
>>
>>________________________________
>>From: Daniel Weeks <[email protected]>
>>To: [email protected]; Sally Khudairi <[email protected]>
>>Cc: Chris Aniszczyk <[email protected]>; Ryan Blue <[email protected]>;
>>"[email protected]" <[email protected]>; "Mattmann, Chris A (3980)"
>><[email protected]>; "[email protected]" <[email protected]>
>>Sent: Friday, 24 April 2015, 13:38
>>Subject: Re: Graduation blog post?
>>
>>Sally,
>>
>>Just wanted to comment that my last name is misspelled in the Netflix
>>testimonial. Can someone fix that? (it's Weeks, not Week)
>>
>>Thanks,
>>Dan
>>
>>
>>On Fri, Apr 24, 2015 at 10:23 AM, Sally Khudairi
>><[email protected]> wrote:
>>
>>Hi everyone --there's been the addition of a quote from Stripe:
>>>
>>>"Stripe's data warehouse has been built on Parquet from the beginning," said
>>>Avi Bryant, Engineering Manager at Stripe. "Every aspect of our pipeline,
>>>from data import to machine learning to adhoc SQL analysis, uses Apache
>>>Parquet as the common interchange format."
>>>
>>>
>>>--please note that I added "Apache" to "Parquet" in the second sentence.
>>>Stripe has also been added to the sub-head.
>>>
>>>Are we waiting for quotes from anyone else? If not, I can add a closing
>>>sentence and forward the final copy later today.
>>>
>>>Thanks so much,
>>>Sally
>>>
>>>
>>>----- Original Message -----
>>>
>>>From: Sally Khudairi <[email protected]>
>>>To: Chris Aniszczyk <[email protected]>;
>>>"[email protected]" <[email protected]>
>>>Cc: Ryan Blue <[email protected]>; "[email protected]"
>>><[email protected]>; "Mattmann, Chris A (3980)"
>>><[email protected]>; "[email protected]" <[email protected]>
>>>Sent: Thursday, 23 April 2015, 15:25
>>>Subject: Re: Graduation blog post?
>>>
>>>Hello everyone --below is the draft thus far.
>>>
>>>
>>>I was aiming to announce on Monday by 7AM ET, but noticed that we're waiting
>>>for additional quotes.
>>>
>>>Also, should we get a closing quote from Julien? Perhaps something that
>>>invites additional community participation?
>>>
>>>Please let me know your thoughts.
>>> >>>Thanks so much, >>>Sally >>> >>>= = = >>> >>>The Apache Software Foundation Announces Apache™ Parquet™ as a Top-Level >>>Project >>> >>>Open Source storage format for the Apache™ Hadoop® ecosystem in use at >>>Cloudera, NASA, Netflix, and Twitter, among other organizations >>> >>>Forest Hill, MD –27 April 2015– The Apache Software Foundation (ASF), the >>>all-volunteer developers, stewards, and incubators of more than 350 Open >>>Source projects and initiatives, announced today that Apache™ Parquet™ has >>>graduated from the Apache Incubator to become a Top-Level Project (TLP), >>>signifying that the project's community and products have been well-governed >>>under the ASF's meritocratic process and principles. >>> >>>"The incubation process at Apache has been fantastic and really the last >>>step of making Parquet a community driven standard fully integrated within >>>the greater Hadoop ecosystem." said Julien Le Dem, Vice President of Apache >>>Parquet. >>> >>>Apache Parquet is an Open Source columnar storage format for the Apache™ >>>Hadoop® ecosystem, built to work across programming languages and much more: >>>- processing frameworks (MapReduce, Apache Spark, Scalding, Cascading, >>>Crunch, Kite) >>>- data models (Apache Avro, Apache Thrift, Protocol Buffers, POJOs) >>>- query engines (Apache Hive, Impala, HAWQ, Apache Drill, Apache Tajo, >>>Apache Pig, Presto, Apache Spark SQL) >>> >>>"At Twitter, Parquet has helped us scale our big data usage by in some cases >>>reducing storage requirements by one third on large datasets as well as scan >>>and deserialization time. This translated into hardware savings as well as >>>reduced latency for accessing the data. Furthermore, Parquet being >>>integrated with so many tools creates opportunities and flexibility >>>regarding query engines," said Chris Aniszczyk, Head of Open Source at >>>Twitter. 
"Finally, it's just fantastic to see it graduate to a top-level >>>project and we look forward to further collaborating with the Apache Parquet >>>community to continually improve performance." >>> >>>"Parquet’s integration with other object models, like Avro and Thrift, has >>>been a key feature for our customers," said Ryan Blue, Software Engineer at >>>Cloudera. "They can take advantage of columnar storage without changing the >>>classes they already use in their production applications." >>> >>>"At Netflix, Parquet is the primary storage format for data warehousing. >>>More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted >>>data that we query across a wide range of tools including Apache Hive, >>>Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce. The >>>performance benefit of columnar projection and statistics is a game changer >>>for our big data platform," said Daniel Week, Software Engineer at Netflix. >>>"We look forward to working with the Apache community to advance the state >>>of big data storage with Parquet and are excited to see the project graduate >>>to full Apache status." >>> >>>"I was extremely happy to see Parquet arrive as an Incubator project," said >>>Chris Mattmann, Apache Parquet Incubator Mentor, and Chief Architect, >>>Instrument and Science Data Systems Section at NASA Jet Propulsion >>>Laboratory. "After talking with some in its community there was a real match >>>with >>>this columnar data format technology and its community with the way that we >>>do things here at the ASF. Parquet has had an exemplar Incubation, and the >>>project has big things ahead of it. I am encouraging my Data Science Team at >>>NASA to evaluate it for data representation especially >>>as it relates to our science holdings in Earth, planetary and space >>>sciences, and astrophysics." >>> >>> >>>Stripe? @cra reached out to Avi, said he would get something by Monday >>>Criteo? >>> >>>@@CLOSING QUOTE FROM JULIEN? 
>>> >>>Availability and Oversight >>>Apache Parquet software is released under the Apache License v2.0 and is >>>overseen by a self-selected team of active contributors to the project. A >>>Project Management Committee (PMC) guides the Project's day-to-day >>>operations, including community development and product releases. For >>>downloads, documentation, and ways to become involved with Apache Parquet, >>>visit http://parquet.apache.org/ and https://twitter.com/ApacheParquet >>> >>>About the Apache Incubator >>>The Apache Incubator is the entry path for projects and codebases wishing to >>>become part of the efforts at The Apache Software Foundation. All code >>>donations from external organizations and existing external projects wishing >>>to join the ASF enter through the Incubator to: 1) ensure all donations are >>>in accordance with the ASF legal standards; and 2) develop new communities >>>that adhere to our guiding principles. Incubation is required of all newly >>>accepted projects until a further review indicates that the infrastructure, >>>communications, and decision making process have stabilized in a manner >>>consistent with other successful ASF projects. While incubation status is >>>not necessarily a reflection of the completeness or stability of the code, >>>it does indicate that the project has yet to be fully endorsed by the ASF. >>>For more information, visit http://incubator.apache.org/. >>> >>>About The Apache Software Foundation (ASF) >>>Established in 1999, the all-volunteer Foundation oversees more than 350 >>>leading Open Source projects, including Apache HTTP Server --the world's >>>most popular Web server software. 
Through the ASF's meritocratic process >>>known as "The Apache Way," more than 500 individual Members and 4,500 >>>Committers successfully collaborate to develop freely available >>>enterprise-grade software, benefiting millions of users worldwide: thousands >>>of software solutions are distributed under the Apache License; and the >>>community actively participates in ASF mailing lists, mentoring initiatives, >>>and ApacheCon, the Foundation's official user conference, trainings, and >>>expo. The ASF is a US 501(c)(3) charitable organization, funded by >>>individual donations and corporate sponsors including Bloomberg, Budget >>>Direct, Cerner, Citrix, Cloudera, Comcast, Facebook, Google, Hortonworks, >>>HP, IBM, InMotion Hosting, iSigma, Matt Mullenweg, Microsoft, Pivotal, >>>Produban, WANdisco, and Yahoo. For more information, visit >>>http://www.apache.org/ or follow @TheASF on Twitter. >>> >>>© The Apache Software Foundation. "Apache", "Avro", "Apache Avro", "Drill", >>>"Apache Drill", "Hadoop", "Apache Hadoop", "Parquet", "Apache Parquet", >>>"Pig", "Apache Pig", "Spark", "Apache Spark", "Tajo", "Apache Tajo", >>>"Thrift", "Apache Thrift", and "ApacheCon" are registered trademarks or >>>trademarks of the Apache Software Foundation in the United States and/or >>>other countries. All other brands and trademarks are the property of their >>>respective owners. >>> >>># # # >>> >>> >>>________________________________ >>> >>>From: Chris Aniszczyk <[email protected]> >>>To: "[email protected]" <[email protected]> >>>Cc: Sally Khudairi <[email protected]>; Ryan Blue <[email protected]>; >>>"[email protected]" <[email protected]>; "Mattmann, Chris A (3980)" >>><[email protected]>; "[email protected]" <[email protected]> >>>Sent: Wednesday, 22 April 2015, 14:51 >>>Subject: Re: Graduation blog post? >>> >>> >>> >>>Thanks Daniel, I added your quote. 
>>>
>>>
>>>On Wed, Apr 22, 2015 at 12:14 PM, Daniel Weeks <[email protected]>
>>>wrote:
>>>
>>>Netflix Testimonial:
>>>>
>>>>At Netflix, Parquet is the primary storage format for data warehousing.
>>>>More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted
>>>>data that we query across a wide range of tools including Apache Hive,
>>>>Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce. The
>>>>performance benefit of columnar projection and statistics is a game changer
>>>>for our big data platform. We look forward to working with the Apache
>>>>community to advance the state of big data storage with Parquet and are
>>>>excited to see the project graduate to full Apache status.
>>>>
>>>>Daniel Weeks
>>>>Engineering Manager - Big Data Compute
>>>>Netflix
>>>>
>>>>
>>>>On Wed, Apr 22, 2015 at 9:36 AM, Sally Khudairi <
>>>>[email protected]> wrote:
>>>>
>>>>> Thanks for the draft thus far, Ryan.
>>>>> Can we please include at least one more industry testimonial?
>>>>> Also, if you can please provide edit access to my account at
>>>>> [email protected], that would be great.
>>>>> Thanks in advance for this!
>>>>> -Sally
>>>>>
>>>>>
>>>>> From: Ryan Blue <[email protected]>
>>>>> To: [email protected]; Sally Khudairi <[email protected]>
>>>>> Cc: "Mattmann, Chris A (3980)" <[email protected]>; "
>>>>> [email protected]" <[email protected]>; "[email protected]" <
>>>>> [email protected]>
>>>>> Sent: Monday, 20 April 2015, 15:48
>>>>> Subject: Re: Graduation blog post?
>>>>>
>>>>> On 04/20/2015 12:36 PM, Jake Farrell wrote:
>>>>> > Hey Sally
>>>>> > i've got root@ karma and will take care of the infra side of things for
>>>>> > us once the board has successfully voted on our resolution
>>>>> >
>>>>> > -Jake
>>>>>
>>>>> Thanks, Jake! I've already sent an e-mail to Infra, but I'll follow up
>>>>> with this news so they don't worry about it.
>>>>>
>>>>> rb
>>>>>
>>>>>
>>>>> --
>>>>> Ryan Blue
>>>>> Software Engineer
>>>>> Cloudera, Inc.
>>>>>
>>>>
>>>
>>>
>>>--
>>>
>>>Cheers,
>>>
>>>Chris Aniszczyk
>>>http://aniszczyk.org
>>>+1 512 961 6719
>>>
>>
>
