Perfect. Thank you, Julien!
I'll confirm once we're live tomorrow morning.
Warmly, Sally
From: Julien Le Dem <[email protected]>
To: Sally Khudairi <[email protected]>
Cc: "[email protected]" <[email protected]>;
Sally Khudairi <[email protected]>; Daniel Weeks <[email protected]>; Chris
Aniszczyk <[email protected]>; Ryan Blue <[email protected]>;
"[email protected]" <[email protected]>; "Mattmann, Chris A (3980)"
<[email protected]>; "[email protected]" <[email protected]>
Sent: Sunday, 26 April 2015, 19:21
Subject: Re: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation
blog post?]
Sounds good. Thank you!
On Sunday, April 26, 2015, Sally Khudairi <[email protected]> wrote:
Thanks, Julien --I can include that, yes.
Does this work for you?
<snip>
Catch Apache Parquet in action at the Hadoop Summit, 9-11 June 2015 in San
Jose, California. The Apache Parquet project welcomes contributions and
community participation through mailing lists, face-to-face MeetUps, and user
events. For more information, visit http://parquet.apache.org/community/
</snip>
Warmest regards,
Sally
Did you want to mention the Parquet talks at the Hadoop Summit in June?
Otherwise this looks good to me.
On Sunday, April 26, 2015, Sally Khudairi <[email protected]>
wrote:
Hi everyone --I haven't received any other feedback, so I think we're all set
to announce tomorrow.
>I'd like to issue the press release at 7AM ET. I'll confirm when we're live.
>If there are any showstoppers, please let me know ASAP.
>Thanks so much, Sally
>
> From: Sally Khudairi <[email protected]>
> To: Sally Khudairi <[email protected]>; Daniel Weeks
> <[email protected]>; "[email protected]"
> <[email protected]>
>Cc: Chris Aniszczyk <[email protected]>; Ryan Blue <[email protected]>;
>"[email protected]" <[email protected]>; "Mattmann, Chris A (3980)"
><[email protected]>; "[email protected]" <[email protected]>
> Sent: Friday, 24 April 2015, 16:17
> Subject: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation blog
> post?]
>
>Hello again, everyone --below is the latest draft.
>
>Please review and forward any changes/additions no later than 5PM ET on Sunday
>in order for us to announce on Monday morning. I was aiming to go live by 7AM
>ET if that works for you.
>
>Kindly confirm.
>
>Thanks in advance,
>Sally
>
>= = =
>
>DRAFT :: NOT FOR DISTRIBUTION
>
>The Apache Software Foundation Announces Apache™ Parquet™ as a Top-Level
>Project
>
>Open Source storage format for the Apache™ Hadoop® ecosystem in use at
>Cloudera, NASA, Netflix, Stripe and Twitter, among other organizations
>
>Forest Hill, MD –27 April 2015– The Apache Software Foundation (ASF), the
>all-volunteer developers, stewards, and incubators of more than 350 Open
>Source projects and initiatives, announced today that Apache™ Parquet™ has
>graduated from the Apache Incubator to become a Top-Level Project (TLP),
>signifying that the project's community and products have been well-governed
>under the ASF's meritocratic process and principles.
>
>"The incubation process at Apache has been fantastic and really the last step
>of making Parquet a community driven standard fully integrated within the
>greater Hadoop ecosystem," said Julien Le Dem, Vice President of Apache
>Parquet.
>
>Apache Parquet is an Open Source columnar storage format for the Apache™
>Hadoop® ecosystem, built to work across programming languages and much more:
>
>
> - processing frameworks (MapReduce, Apache Spark, Scalding, Cascading,
> Crunch, Kite)
> - data models (Apache Avro, Apache Thrift, Protocol Buffers, POJOs)
> - query engines (Apache Hive, Impala, HAWQ, Apache Drill, Apache Tajo, Apache
> Pig, Presto, Apache Spark SQL)
>
>"At Twitter, Parquet has helped us scale our big data usage by in some cases
>reducing storage requirements by one third on large datasets as well as scan
>and deserialization time. This translated into hardware savings as well as
>reduced latency for accessing the data. Furthermore, Parquet being integrated
>with so many tools creates opportunities and flexibility regarding query
>engines," said Chris Aniszczyk, Head of Open Source at Twitter. "Finally, it's
>just fantastic to see it graduate to a top-level project and we look forward
>to further collaborating with the Apache Parquet community to continually
>improve performance."
>
>"Parquet’s integration with other object models, like Avro and Thrift, has
>been a key feature for our customers," said Ryan Blue, Software Engineer at
>Cloudera. "They can take advantage of columnar storage without changing the
>classes they already use in their production applications."
>
>"At Netflix, Parquet is the primary storage format for data warehousing. More
>than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted data that
>we query across a wide range of tools including Apache Hive, Apache Pig,
>Apache Spark, PigPen, Presto, and native MapReduce. The performance benefit of
>columnar projection and statistics is a game changer for our big data
>platform," said Daniel Weeks, Software Engineer at Netflix. "We look forward
>to working with the Apache community to advance the state of big data storage
>with Parquet and are excited to see the project graduate to full Apache
>status."
>
>"Stripe's data warehouse has been built on Parquet from the beginning," said
>Avi Bryant, Engineering Manager at Stripe. "Every aspect of our pipeline, from
>data import to machine learning to adhoc SQL analysis, uses Apache Parquet as
>the common interchange format."
>
>"I was extremely happy to see Parquet arrive as an Incubator project," said
>Chris Mattmann, Apache Parquet Incubator Mentor, and Chief Architect,
>Instrument and Science Data Systems Section at NASA Jet Propulsion Laboratory.
>"After talking with some in its community there was a real match with this
>columnar data format technology and its community with the way that we do
>things here at the ASF. Parquet has had an exemplar Incubation, and the
>project has big things ahead of it. I am encouraging my Data Science Team at
>NASA to evaluate it for data representation especially as it relates to our
>science holdings in Earth, planetary and space sciences, and astrophysics."
>
>The Apache Parquet project welcomes contributions and community participation
>through mailing lists, face-to-face MeetUps, and user events. For more
>information, visit http://parquet.apache.org/community/
>
>Availability and Oversight
>Apache Parquet software is released under the Apache License v2.0 and is
>overseen by a self-selected team of active contributors to the project. A
>Project Management Committee (PMC) guides the Project's day-to-day operations,
>including community development and product releases. For downloads,
>documentation, and ways to become involved with Apache Parquet, visit
>http://parquet.apache.org/ and https://twitter.com/ApacheParquet
>
>About the Apache Incubator
>The Apache Incubator is the entry path for projects and codebases wishing to
>become part of the efforts at The Apache Software Foundation. All code
>donations from external organizations and existing external projects wishing
>to join the ASF enter through the Incubator to: 1) ensure all donations are in
>accordance with the ASF legal standards; and 2) develop new communities that
>adhere to our guiding principles. Incubation is required of all newly accepted
>projects until a further review indicates that the infrastructure,
>communications, and decision making process have stabilized in a manner
>consistent with other successful ASF projects. While incubation status is not
>necessarily a reflection of the completeness or stability of the code, it does
>indicate that the project has yet to be fully endorsed by the ASF. For more
>information, visit http://incubator.apache.org/.
>
>About The Apache Software Foundation (ASF)
>Established in 1999, the all-volunteer Foundation oversees more than 350
>leading Open Source projects, including Apache HTTP Server --the world's most
>popular Web server software. Through the ASF's meritocratic process known as
>"The Apache Way," more than 500 individual Members and 4,500 Committers
>successfully collaborate to develop freely available enterprise-grade
>software, benefiting millions of users worldwide: thousands of software
>solutions are distributed under the Apache License; and the community actively
>participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the
>Foundation's official user conference, trainings, and expo. The ASF is a US
>501(c)(3) charitable organization, funded by individual donations and
>corporate sponsors including Bloomberg, Budget Direct, Cerner, Citrix,
>Cloudera, Comcast, Facebook, Google, Hortonworks, HP, IBM, InMotion Hosting,
>iSigma, Matt Mullenweg, Microsoft, Pivotal, Produban, WANdisco, and Yahoo. For
>more information, visit http://www.apache.org/ or follow @TheASF on Twitter.
>
>© The Apache Software Foundation. "Apache", "Avro", "Apache Avro", "Drill",
>"Apache Drill", "Hadoop", "Apache Hadoop", "Parquet", "Apache Parquet", "Pig",
>"Apache Pig", "Spark", "Apache Spark", "Tajo", "Apache Tajo", "Thrift",
>"Apache Thrift", and "ApacheCon" are registered trademarks or trademarks of
>the Apache Software Foundation in the United States and/or other countries.
>All other brands and trademarks are the property of their respective owners.
>
># # #
>
>[MEDIA CONTACT:SALLY]
>________________________________
>
>
>From: Sally Khudairi <[email protected]>
>To: Sally Khudairi <[email protected]>; Daniel Weeks
><[email protected]>; "[email protected]"
><[email protected]>
>Cc: Chris Aniszczyk <[email protected]>; Ryan Blue <[email protected]>;
>"[email protected]" <[email protected]>; "Mattmann, Chris A (3980)"
><[email protected]>; "[email protected]" <[email protected]>
>Sent: Friday, 24 April 2015, 13:56
>Subject: Re: Graduation blog post?
>
>
>
>Done.
>
>ALL: can you please let me know if there are any events that Parquet will be
>at? Presenting? Hosting? etc.
>
>Thank you!
>
>-Sally
>
>________________________________
>From: Sally Khudairi <[email protected]>
>To: Daniel Weeks <[email protected]>; "[email protected]"
><[email protected]>
>Cc: Chris Aniszczyk <[email protected]>; Ryan Blue <[email protected]>;
>"[email protected]" <[email protected]>; "Mattmann, Chris A (3980)"
><[email protected]>; "[email protected]" <[email protected]>
>Sent: Friday, 24 April 2015, 13:40
>Subject: Re: Graduation blog post?
>
>
>
>Of course --I'll fix that now!
>
>Sorry about that, Daniel.
>
>-Sally
>
>________________________________
>From: Daniel Weeks <[email protected]>
>To: [email protected]; Sally Khudairi <[email protected]>
>Cc: Chris Aniszczyk <[email protected]>; Ryan Blue <[email protected]>;
>"[email protected]" <[email protected]>; "Mattmann, Chris A (3980)"
><[email protected]>; "[email protected]" <[email protected]>
>Sent: Friday, 24 April 2015, 13:38
>Subject: Re: Graduation blog post?
>
>
>
>Sally,
>
>Just wanted to comment that my last name is misspelled in the Netflix
>testimonial. Can someone fix that? (it's Weeks, not Week)
>
>Thanks,
>Dan
>
>
>
>
>On Fri, Apr 24, 2015 at 10:23 AM, Sally Khudairi
><[email protected]> wrote:
>
>Hi everyone --there's been the addition of a quote from Stripe:
>>
>>"Stripe's data warehouse has been built on Parquet from the beginning," said
>>Avi Bryant, Engineering Manager at Stripe. "Every aspect of our pipeline,
>>from data import to machine learning to adhoc SQL analysis, uses Apache
>>Parquet as the common interchange format."
>>
>>
>>--please note that I added "Apache" to "Parquet" in the second sentence.
>>Stripe has also been added to the sub-head.
>>
>>Are we waiting for quotes from anyone else? If not, I can add a closing
>>sentence and forward the final copy later today.
>>
>>Thanks so much,
>>Sally
>>
>>
>>
>>----- Original Message -----
>>
>>From: Sally Khudairi <[email protected]>
>>To: Chris Aniszczyk <[email protected]>;
>>"[email protected]" <[email protected]>
>>Cc: Ryan Blue <[email protected]>; "[email protected]"
>><[email protected]>; "Mattmann, Chris A (3980)"
>><[email protected]>; "[email protected]" <[email protected]>
>>Sent: Thursday, 23 April 2015, 15:25
>>Subject: Re: Graduation blog post?
>>
>>Hello everyone --below is the draft thus far.
>>
>>
>>I was aiming to announce on Monday by 7AM ET, but noticed that we're waiting
>>for additional quotes.
>>
>>Also, should we get a closing quote from Julien? Perhaps something that
>>invites additional community participation?
>>
>>Please let me know your thoughts.
>>
>>Thanks so much,
>>Sally
>>
>>= = =
>>
>>The Apache Software Foundation Announces Apache™ Parquet™ as a Top-Level
>>Project
>>
>>Open Source storage format for the Apache™ Hadoop® ecosystem in use at
>>Cloudera, NASA, Netflix, and Twitter, among other organizations
>>
>>Forest Hill, MD –27 April 2015– The Apache Software Foundation (ASF), the
>>all-volunteer developers, stewards, and incubators of more than 350 Open
>>Source projects and initiatives, announced today that Apache™ Parquet™ has
>>graduated from the Apache Incubator to become a Top-Level Project (TLP),
>>signifying that the project's community and products have been well-governed
>>under the ASF's meritocratic process and principles.
>>
>>"The incubation process at Apache has been fantastic and really the last step
>>of making Parquet a community driven standard fully integrated within the
>>greater Hadoop ecosystem," said Julien Le Dem, Vice President of Apache
>>Parquet.
>>
>>Apache Parquet is an Open Source columnar storage format for the Apache™
>>Hadoop® ecosystem, built to work across programming languages and much more:
>>- processing frameworks (MapReduce, Apache Spark, Scalding, Cascading,
>>Crunch, Kite)
>>- data models (Apache Avro, Apache Thrift, Protocol Buffers, POJOs)
>>- query engines (Apache Hive, Impala, HAWQ, Apache Drill, Apache Tajo, Apache
>>Pig, Presto, Apache Spark SQL)
>>
>>"At Twitter, Parquet has helped us scale our big data usage by in some cases
>>reducing storage requirements by one third on large datasets as well as scan
>>and deserialization time. This translated into hardware savings as well as
>>reduced latency for accessing the data. Furthermore, Parquet being integrated
>>with so many tools creates opportunities and flexibility regarding query
>>engines," said Chris Aniszczyk, Head of Open Source at Twitter. "Finally,
>>it's just fantastic to see it graduate to a top-level project and we look
>>forward to further collaborating with the Apache Parquet community to
>>continually improve performance."
>>
>>"Parquet’s integration with other object models, like Avro and Thrift, has
>>been a key feature for our customers," said Ryan Blue, Software Engineer at
>>Cloudera. "They can take advantage of columnar storage without changing the
>>classes they already use in their production applications."
>>
>>"At Netflix, Parquet is the primary storage format for data warehousing. More
>>than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted data that
>>we query across a wide range of tools including Apache Hive, Apache Pig,
>>Apache Spark, PigPen, Presto, and native MapReduce. The performance benefit
>>of columnar projection and statistics is a game changer for our big data
>>platform," said Daniel Week, Software Engineer at Netflix. "We look forward
>>to working with the Apache community to advance the state of big data storage
>>with Parquet and are excited to see the project graduate to full Apache
>>status."
>>
>>"I was extremely happy to see Parquet arrive as an Incubator project," said
>>Chris Mattmann, Apache Parquet Incubator Mentor, and Chief Architect,
>>Instrument and Science Data Systems Section at NASA Jet Propulsion
>>Laboratory. "After talking with some in its community there was a real match
>>with this columnar data format technology and its community with the way that
>>we do things here at the ASF. Parquet has had an exemplar Incubation, and the
>>project has big things ahead of it. I am encouraging my Data Science Team at
>>NASA to evaluate it for data representation especially as it relates to our
>>science holdings in Earth, planetary and space sciences, and astrophysics."
>>
>>
>>Stripe? @cra reached out to Avi, said he would get something by Monday
>>Criteo?
>>
>>@@CLOSING QUOTE FROM JULIEN?
>>
>>Availability and Oversight
>>Apache Parquet software is released under the Apache License v2.0 and is
>>overseen by a self-selected team of active contributors to the project. A
>>Project Management Committee (PMC) guides the Project's day-to-day
>>operations, including community development and product releases. For
>>downloads, documentation, and ways to become involved with Apache Parquet,
>>visit http://parquet.apache.org/ and https://twitter.com/ApacheParquet
>>
>>About the Apache Incubator
>>The Apache Incubator is the entry path for projects and codebases wishing to
>>become part of the efforts at The Apache Software Foundation. All code
>>donations from external organizations and existing external projects wishing
>>to join the ASF enter through the Incubator to: 1) ensure all donations are
>>in accordance with the ASF legal standards; and 2) develop new communities
>>that adhere to our guiding principles. Incubation is required of all newly
>>accepted projects until a further review indicates that the infrastructure,
>>communications, and decision making process have stabilized in a manner
>>consistent with other successful ASF projects. While incubation status is not
>>necessarily a reflection of the completeness or stability of the code, it
>>does indicate that the project has yet to be fully endorsed by the ASF. For
>>more information, visit http://incubator.apache.org/.
>>
>>About The Apache Software Foundation (ASF)
>>Established in 1999, the all-volunteer Foundation oversees more than 350
>>leading Open Source projects, including Apache HTTP Server --the world's most
>>popular Web server software. Through the ASF's meritocratic process known as
>>"The Apache Way," more than 500 individual Members and 4,500 Committers
>>successfully collaborate to develop freely available enterprise-grade
>>software, benefiting millions of users worldwide: thousands of software
>>solutions are distributed under the Apache License; and the community
>>actively participates in ASF mailing lists, mentoring initiatives, and
>>ApacheCon, the Foundation's official user conference, trainings, and expo.
>>The ASF is a US 501(c)(3) charitable organization, funded by individual
>>donations and corporate sponsors including Bloomberg, Budget Direct, Cerner,
>>Citrix, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, IBM, InMotion
>>Hosting, iSigma, Matt Mullenweg, Microsoft, Pivotal, Produban, WANdisco, and
>>Yahoo. For more information, visit http://www.apache.org/ or follow @TheASF
>>on Twitter.
>>
>>© The Apache Software Foundation. "Apache", "Avro", "Apache Avro", "Drill",
>>"Apache Drill", "Hadoop", "Apache Hadoop", "Parquet", "Apache Parquet",
>>"Pig", "Apache Pig", "Spark", "Apache Spark", "Tajo", "Apache Tajo",
>>"Thrift", "Apache Thrift", and "ApacheCon" are registered trademarks or
>>trademarks of the Apache Software Foundation in the United States and/or
>>other countries. All other brands and trademarks are the property of their
>>respective owners.
>>
>># # #
>>
>>
>>________________________________
>>
>>From: Chris Aniszczyk <[email protected]>
>>To: "[email protected]" <[email protected]>
>>Cc: Sally Khudairi <[email protected]>; Ryan Blue <[email protected]>;
>>"[email protected]" <[email protected]>; "Mattmann, Chris A (3980)"
>><[email protected]>; "[email protected]" <[email protected]>
>>Sent: Wednesday, 22 April 2015, 14:51
>>Subject: Re: Graduation blog post?
>>
>>
>>
>>Thanks Daniel, I added your quote.
>>
>>
>>
>>
>>On Wed, Apr 22, 2015 at 12:14 PM, Daniel Weeks <[email protected]>
>>wrote:
>>
>>Netflix Testimonial:
>>>
>>>At Netflix, Parquet is the primary storage format for data warehousing.
>>>More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted
>>>data that we query across a wide range of tools including Apache Hive,
>>>Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce. The
>>>performance benefit of columnar projection and statistics is a game changer
>>>for our big data platform. We look forward to working with the Apache
>>>community to advance the state of big data storage with Parquet and are
>>>excited to see the project graduate to full Apache status.
>>>
>>>Daniel Weeks
>>>Engineering Manager - Big Data Compute
>>>Netflix
>>>
>>>
>>>On Wed, Apr 22, 2015 at 9:36 AM, Sally Khudairi <
>>>[email protected]> wrote:
>>>
>>>> Thanks for the draft thus far, Ryan.
>>>> Can we please include at least one more industry testimonial?
>>>> Also, if you can please provide edit access to my account at
>>>> [email protected], that would be great.
>>>> Thanks in advance for this!
>>>> -Sally
>>>>
>>>>
>>>> From: Ryan Blue <[email protected]>
>>>> To: [email protected]; Sally Khudairi <[email protected]>
>>>> Cc: "Mattmann, Chris A (3980)" <[email protected]>; "
>>>> [email protected]" <[email protected]>; "[email protected]" <
>>>> [email protected]>
>>>> Sent: Monday, 20 April 2015, 15:48
>>>> Subject: Re: Graduation blog post?
>>>>
>>>> On 04/20/2015 12:36 PM, Jake Farrell wrote:
>>>> > Hey Sally
>>>> > i've got root@ karma and will take care of the infra side of things for
>>>> > us once the board has successfully voted on our resolution
>>>> >
>>>> > -Jake
>>>>
>>>> Thanks, Jake! I've already sent an e-mail to Infra, but I'll follow up
>>>> with this news so they don't worry about it.
>>>>
>>>> rb
>>>>
>>>>
>>>> --
>>>> Ryan Blue
>>>> Software Engineer
>>>> Cloudera, Inc.
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>--
>>
>>Cheers,
>>
>>Chris Aniszczyk
>>http://aniszczyk.org
>>+1 512 961 6719
>>
>
>