[drill-site] 02/02: Remove duplicated Drill 1.21 blog post and add cgivre to authors.json.

dzamo Fri, 03 Mar 2023 03:26:19 -0800

This is an automated email from the ASF dual-hosted git repository.

dzamo pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/drill-site.git


commit 652aa27c131a2e70596c78704b33fe1ef039ee31
Author: James Turton <[email protected]>
AuthorDate: Fri Mar 3 13:24:16 2023 +0200

    Remove duplicated Drill 1.21 blog post and add cgivre to authors.json.
---
 _data/authors.json                                 |  8 ++-
 blog/_posts/en/2023-02-21-drill-1.21.0-released.md | 23 -------
 .../en/2023-03-02-drill-1.21-announcement.md       | 73 ----------------------
 3 files changed, 7 insertions(+), 97 deletions(-)

diff --git a/_data/authors.json b/_data/authors.json
index b1e58c740..a2caeec34 100644
--- a/_data/authors.json
+++ b/_data/authors.json
@@ -58,5 +58,11 @@
     "title": "Committer",
     "org": "Apache Software Foundation",
     "email": "[email protected]"
-  }
+  },
+  "cgivre": {
+    "name": "Charles Givre",
+    "title": "PMC Chair and Committer",
+    "org": "Apache Drill",
+    "email": "[email protected]"
+  },
 }
diff --git a/blog/_posts/en/2023-02-21-drill-1.21.0-released.md 
b/blog/_posts/en/2023-02-21-drill-1.21.0-released.md
deleted file mode 100644
index 54a619092..000000000
--- a/blog/_posts/en/2023-02-21-drill-1.21.0-released.md
+++ /dev/null
@@ -1,23 +0,0 @@
----
-layout: post
-title: "Drill 1.21.0 Released"
-code: drill-1.21.0-released
-excerpt: Apache Drill 1.21.0 has been released.
-
-authors: ["jturton"]
----
-
-Today, we're happy to announce the availability of Drill 1.21.0. You can 
download it [here](https://drill.apache.org/download/).
-
-## This release provides the following new features:
-
-* A major upgrade of the parsing and planning library Calcite from 1.21 to 
1.33 enabled by the elimination of Drill's fork of Calcite.
-* Upgrades of most format plugins to the internal EVF2 reader framework 
included support for provided schemas.
-* A new native Drill storage plugin enabling "Drill-on-Drill" federated 
deployments.
-* INSERT support, currently in the JDBC, Splunk and Google Sheets plugins.
-* New SQL syntax including filtered aggregates, PIVOT, UNPIVOT, INTERSECT and 
EXCEPT.
-* Support for new authentication modes in storage plugins including user 
translation for using different external credentials for different Drill users.
-* An overhaul of the implicit type casting logic for a more consistent user 
experience.
-* New functions and storage plugins including Delta Lake, Google Sheets, MS 
Access, threat hunting functions and statistical distribution functions.
-
-You can find a complete list of improvements and JIRAs resolved in the 1.21.0 
release [here]({{site.baseurl}}/docs/apache-drill-1-21-0-release-notes/).
diff --git a/blog/_posts/en/2023-03-02-drill-1.21-announcement.md 
b/blog/_posts/en/2023-03-02-drill-1.21-announcement.md
deleted file mode 100644
index f610b3342..000000000
--- a/blog/_posts/en/2023-03-02-drill-1.21-announcement.md
+++ /dev/null
@@ -1,73 +0,0 @@
----
-layout: post
-title: "Announcing Drill 1.21!"
-code: drill-1.21-announcement
-excerpt: "Announcing Drill 1.21: New Connectors, Functions and Much Better 
Stability."
-
-authors: ["cgivre"]
----
-
-
-# Announcing Drill 1.21: New Connectors, Functions and Much Better Stability
-The Apache Drill PMC is pleased to announce a milestone release of Apache 
Drill. Since the last release of Drill the team has been hard at work quashing 
bugs and making overall functionality improvements. The TL;DR includes the 
following:
-
-* New connectors including Apache Iceberg, Delta Lake, Microsoft Access, 
GoogleSheets, and Box
-* Efficient cross-cloud query capability
-* Greatly improved access controls to include user translation support for all 
storage plugins
-* Greatly improved query planning and implicit casting.
-* New BI-focused SQL operators including `PIVOT`, `UNPIVOT`, `EXCEPT` and 
`INTERSECT`
-* New functions for computing regression lines and trends.
-* New and updated date manipulation functions.
-
-Overall, Drill 1.21 is much more capable and stable than previous versions. 
-
-## Calcite, We’re Back!
-Drill relies on another open source project, Apache Calcite for its query 
planning. The query planning process is a huge part of the overall 
functionality of Drill. Unfortunately, about three years ago, there were some 
issues in Calcite which forced Drill to fork it and rely on that fork. As a 
result, Drill was essentially stuck with a three year old query planner, but 
more importantly, bugs that were fixed in Calcite, as well as new capabilities 
were not finding their way into Drill. 
-
-That is no longer the case. Drill 1.21 is now running on the latest stable 
version of Calcite, version 1.33. As a result, we’ve been able to close 
countless JIRA tickets of various queries failing and other random bugs that 
were the result of query planning bugs.
-
-What this means for you as a user is that you’ll see much fewer queries 
failing and better overall performance in terms of speed and stability. You’ll 
see better optimizations being pushed down to JDBC data sources as well as 
support for BigQuery, Athena and other JDBC data sources. We hope to keep Drill 
away from Calcite forks so I hope that we will work with the Calcite community 
to keep our tools in sync.
-
-## Improved Implicit Casting Rules Reduce Schema Change Failures
-From this author’s perspective, one of the biggest improvements in Drill is 
one of the least noticeable and that is the result of improved implicit 
casting. One of Drill’s unique features is its ability to infer the structure, 
or schema of your data. However, this can be problematic when the schema 
changes. When I used to teach Drill, I used to have spend a considerable amount 
of time teaching students how to cast data from one data type to another to 
ensure that the queries would succeed.
-
-When using latest version of Drill, you’ll find that queries will work without 
the need for much if any casting. In short, they’ll do what you expect them to 
do. It’s really a high on magic functionality. 
-
-## Integrations with the Modern and Not-so-Modern Data Stack
-The new version of Drill features several new connectors and readers that will 
enable users to connect to the “modern data stack”, specifically support for 
Apache Iceberg and Delta Lake. 
-
-### Breaking the Iceberg
-Iceberg is a high-performance format for huge analytic tables. Iceberg brings 
the reliability and simplicity of SQL tables to big data, while making it 
possible for engines like Drill to safely work with the same tables, at the 
same time. In addition to being able to query data directly from Iceberg 
tables, Drill also allows users to query the Iceberg table metadata as well as 
snapshots.  [Complete documentation is available 
here](https://drill.apache.org/docs/iceberg-format-plugin/).
-
-### Querying Delta Lake
-Lest we offend someone, we’re not going to get into the debate between Iceberg 
and Delta lake (after all, let’s not argue about who killed whom), but Delta 
Lake, if you aren’t familiar with it, is another modern table format which 
allows ACID transactions, versioning etc. In version 1.21, Drill adds support 
for Delta Lake tables, so users can query Delta Lake tables as well as 
associated metadata. You can also query specific versions of files in delta 
lake.  [Complete documentation is av [...]
-
-### Accessing Access
-A surprising number of people use Microsoft Access as a database for their 
business data. With version 1.21, Apache Drill can now natively query Microsoft 
Access database files using Drill. This can be a major benefit for those 
looking to migrate data from Access into more modern formats such as parquet or 
even other relational databases. Drill will support Access files from version 
1997 and up. 
-
-### Oh Sheets!
-In addition to all of the above, Drill can now query data directly from 
GoogleSheets. In addition to being able to query this data source, Drill can 
read, write, delete and append to GoogleSheets. Google doesn’t make it easy, so 
if this is a feature you are interested in, you’ll definitely want to [read the 
documentation 
here](https://drill.apache.org/docs/google-sheets-storage-plugin/).
-
-### Remote Data
-As you can see, Drill has significantly expanded the number of data sources 
and types that it can query. A part of this work has also been to improve the 
implementation behind filesystems. As a result, Drill can now query data stored 
on Dropbox, and Box. We added support for filesystems which use OAuth 2.0 for 
authorization so this means that more extended file systems are likely coming 
your way for the next release.
-
-## Greatly Improved Access Controls
-Managing access controls and credentials on a federated query engine is a 
complicated task. Drill has supported a concept called user impersonation which 
basically means that Drill can execute queries using the credentials of the 
logged in user. This concept works well for querying file systems such as 
Hadoop, and other data sources that have the same concepts, however it does not 
work at all with data sources that have different concepts of users, or in the 
case of OAuth enabled plugins [...]
-
-To answer this challenge, Drill 1.21 introduces the concept of user 
translation. The idea of user translation is that, when enabled, every user 
will have their own unique credentials for specific data sources. Thus, when 
that user queries a specific data source, that user’s credentials are used to 
execute the query. This is configurable on an individual data source basis. 
Ultimately, what this means is that you no longer have to create service 
accounts to access data via Drill. 
-
-## Drilling Across the Clouds
-While we’re on the subject of clouds, as you may be aware, Drill can query 
data stored in cloud-based file systems such as S3, Azure, GCP etc. One of the 
challenges however, is that if you have data stored in multiple clouds, it can 
become very inefficient to query this data, especially from the perspective of 
network IO. As of Drill 1.21, Drill adds a storage plugin which we are calling 
Drill on Drill.
-
-Let’s say that you had a Drill cluster in S3, but you had data in both S3 and 
Azure. With the new Drill on Drill capability, you could install an additional 
Drill cluster in Azure, then query both from either Drill cluster. The 
advantage is that the queries would be pushed down to the Drill cluster where 
the data resides. So if you query Azure from S3, you aren’t sending tons of 
data back and forth. 
-
-## Drill Now Supports More BI Operators
-While Drill held more or less to the SQL standard, it was missing some BI 
operators that had become commonplace among SQL platforms. Drill 1.21 
introduces the `PIVOT`, and `UNPIVOT` operators which covert rows to columns or 
vice versa, much in the same way a pivot table works in Excel. Additionally, we 
added set operators `INTERSECT` and `EXCEPT` which have become part of the SQL 
standard.
-
-## New Statistical Functions
-Drill 1.21 adds new SQL functions for statistical summaries including 
`kendall_correlation` for calculating correlation coefficients, `width_bucket` 
which is a SQL function for computing histograms and distributions, and two 
other functions for computing regression lines. 
-
-Lastly, we’ve also added additional date/time manipulation functions which 
will make working with dates significantly easier. 
-
-## What’s Next?
-The big question is where do we go from here? We’ve already started working on 
adding support for additional BI operators such as `CUBE`, `GROUPING SETS` and 
`ROLLUP`, as well as `REGEXP_EXTRACT`. Since the new version of Calcite has 
support for numerous optimizations around materialized views this is also 
something which is being discussed. If you like what you are seeing, please 
download Drill and try it out. Feedback is always welcome on the Drill slack 
channel or on our mailing lists [...]

[drill-site] 02/02: Remove duplicated Drill 1.21 blog post and add cgivre to authors.json.

Reply via email to