[arrow-datafusion] branch asf-site updated: Publish built docs triggered by af97ac886c425efefb0536c5344894703f65d7fa

github-bot Wed, 22 Mar 2023 03:57:08 -0700

This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/arrow-datafusion.git



The following commit(s) were added to refs/heads/asf-site by this push:
     new f812e0a4c2 Publish built docs triggered by af97ac886c425efefb0536c5344894703f65d7fa
f812e0a4c2 is described below

commit f812e0a4c28ff75fd88a0054d33b333eba530433
Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
AuthorDate: Wed Mar 22 10:55:56 2023 +0000

    Publish built docs triggered by af97ac886c425efefb0536c5344894703f65d7fa
---
 _sources/user-guide/introduction.md.txt | 51 +++++++++++++++++++++++------
 searchindex.js                          |  2 +-
 user-guide/introduction.html            | 58 +++++++++++++++++++++++++++------
 3 files changed, 90 insertions(+), 21 deletions(-)

diff --git a/_sources/user-guide/introduction.md.txt b/_sources/user-guide/introduction.md.txt
index 64b6be9d28..5e6859f8d6 100644
--- a/_sources/user-guide/introduction.md.txt
+++ b/_sources/user-guide/introduction.md.txt
@@ -19,21 +19,52 @@
 
 # Introduction
 
-DataFusion is an extensible query execution framework, written in
-Rust, that uses [Apache Arrow](https://arrow.apache.org) as its
+DataFusion is a very fast, extensible query engine for building high-quality data-centric systems in
+[Rust](http://rustlang.org), using the [Apache Arrow](https://arrow.apache.org)
 in-memory format.
 
-DataFusion supports SQL and a DataFrame API for building logical query
-plans, an extensive query optimizer, and a multi-threaded parallel
-execution execution engine for processing partitioned data sources
-such as CSV and Parquet files extremely quickly.
+DataFusion offers SQL and Dataframe APIs, excellent [performance](https://benchmark.clickhouse.com/), built-in support for CSV, Parquet, JSON, and Avro, extensive customization, and a great community.
+
+## Features
+
+- Feature-rich [SQL support](https://arrow.apache.org/datafusion/user-guide/sql/index.html) and [DataFrame API](https://arrow.apache.org/datafusion/user-guide/dataframe.html)
+- Blazingly fast, vectorized, multi-threaded, streaming execution engine.
+- Native support for Parquet, CSV, JSON, and Avro file formats. Support
+  for custom file formats and non file datasources via the `TableProvider` trait.
+- Many extension points: user defined scalar/aggregate/window functions, DataSources, SQL,
+  other query languages, custom plan and execution nodes, optimizer passes, and more.
+- Streaming, asynchronous IO directly from popular object stores, including AWS S3,
+  Azure Blob Storage, and Google Cloud Storage. Other storage systems are supported via the
+  `ObjectStore` trait.
+- [Excellent Documentation](https://docs.rs/datafusion/latest) and a
+  [welcoming community](https://arrow.apache.org/datafusion/contributor-guide/communication.html).
+- A state of the art query optimizer with projection and filter pushdown, sort aware optimizations,
+  automatic join reordering, expression coercion, and more.
+- Permissive Apache 2.0 License, Apache Software Foundation governance
+- Written in [Rust](https://www.rust-lang.org/), a modern system language with development
+  productivity similar to Java or Golang, the performance of C++, and
+  [loved by programmers everywhere](https://insights.stackoverflow.com/survey/2021#technology-most-loved-dreaded-and-wanted).
+- Support for [Substrait](https://substrait.io/) for query plan serialization, making it easier to integrate DataFusion
+  with other projects, and to pass plans across language boundaries.
 
 ## Use Cases
 
-DataFusion is used to create modern, fast and efficient data
-pipelines, ETL processes, and database systems, which need the
-performance of Rust and Apache Arrow and want to provide their users
-the convenience of an SQL interface or a DataFrame API.
+DataFusion can be used without modification as an embedded SQL
+engine or can be customized and used as a foundation for
+building new systems. Here are some examples of systems built using DataFusion:
+
+- Specialized Analytical Database systems such as [CeresDB] and more general Apache Spark like system such a [Ballista].
+- New query language engines such as [prql-query] and accelerators such as [VegaFusion]
+- Research platform for new Database Systems, such as [Flock]
+- SQL support to another library, such as [dask sql]
+- Streaming data platforms such as [Synnada]
+- Tools for reading / sorting / transcoding Parquet, CSV, AVRO, and JSON files such as [qv]
+- A faster Spark runtime replacement [Blaze]
+
+By using DataFusion, the projects are freed to focus on their specific
+features, and avoid reimplementing general (but still necessary)
+features such as an expression representation, standard optimizations,
+execution plans, file format support, etc.
 
 ## Why DataFusion?
 
diff --git a/searchindex.js b/searchindex.js
index 5abefb11d6..3fe8cf7415 100644
--- a/searchindex.js
+++ b/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"docnames": ["contributor-guide/communication", "contributor-guide/index", "contributor-guide/quarterly_roadmap", "contributor-guide/roadmap", "contributor-guide/specification/index", "contributor-guide/specification/invariants", "contributor-guide/specification/output-field-name-semantic", "index", "user-guide/cli", "user-guide/configs", "user-guide/dataframe", "user-guide/example-usage", "user-guide/expressions", "user-guide/faq", "user-guide/introduction", "user-guide [...]
\ No newline at end of file
+Search.setIndex({"docnames": ["contributor-guide/communication", "contributor-guide/index", "contributor-guide/quarterly_roadmap", "contributor-guide/roadmap", "contributor-guide/specification/index", "contributor-guide/specification/invariants", "contributor-guide/specification/output-field-name-semantic", "index", "user-guide/cli", "user-guide/configs", "user-guide/dataframe", "user-guide/example-usage", "user-guide/expressions", "user-guide/faq", "user-guide/introduction", "user-guide [...]
\ No newline at end of file
diff --git a/user-guide/introduction.html b/user-guide/introduction.html
index a8a2d662a8..17786939a3 100644
--- a/user-guide/introduction.html
+++ b/user-guide/introduction.html
@@ -257,6 +257,11 @@
 
 <nav id="bd-toc-nav">
     <ul class="visible nav section-nav flex-column">
+ <li class="toc-h2 nav-item toc-entry">
+  <a class="reference internal nav-link" href="#features">
+   Features
+  </a>
+ </li>
  <li class="toc-h2 nav-item toc-entry">
   <a class="reference internal nav-link" href="#use-cases">
    Use Cases
@@ -315,19 +320,52 @@
 -->
 <section id="introduction">
 <h1>Introduction<a class="headerlink" href="#introduction" title="Permalink to this heading">¶</a></h1>
-<p>DataFusion is an extensible query execution framework, written in
-Rust, that uses <a class="reference external" href="https://arrow.apache.org">Apache Arrow</a> as its
+<p>DataFusion is a very fast, extensible query engine for building high-quality data-centric systems in
+<a class="reference external" href="http://rustlang.org">Rust</a>, using the <a class="reference external" href="https://arrow.apache.org">Apache Arrow</a>
 in-memory format.</p>
-<p>DataFusion supports SQL and a DataFrame API for building logical query
-plans, an extensive query optimizer, and a multi-threaded parallel
-execution execution engine for processing partitioned data sources
-such as CSV and Parquet files extremely quickly.</p>
+<p>DataFusion offers SQL and Dataframe APIs, excellent <a class="reference external" href="https://benchmark.clickhouse.com/">performance</a>, built-in support for CSV, Parquet, JSON, and Avro, extensive customization, and a great community.</p>
+<section id="features">
+<h2>Features<a class="headerlink" href="#features" title="Permalink to this heading">¶</a></h2>
+<ul class="simple">
+<li><p>Feature-rich <a class="reference external" href="https://arrow.apache.org/datafusion/user-guide/sql/index.html">SQL support</a> and <a class="reference external" href="https://arrow.apache.org/datafusion/user-guide/dataframe.html">DataFrame API</a></p></li>
+<li><p>Blazingly fast, vectorized, multi-threaded, streaming execution engine.</p></li>
+<li><p>Native support for Parquet, CSV, JSON, and Avro file formats. Support
+for custom file formats and non file datasources via the <code class="docutils literal notranslate"><span class="pre">TableProvider</span></code> trait.</p></li>
+<li><p>Many extension points: user defined scalar/aggregate/window functions, DataSources, SQL,
+other query languages, custom plan and execution nodes, optimizer passes, and more.</p></li>
+<li><p>Streaming, asynchronous IO directly from popular object stores, including AWS S3,
+Azure Blob Storage, and Google Cloud Storage. Other storage systems are supported via the
+<code class="docutils literal notranslate"><span class="pre">ObjectStore</span></code> trait.</p></li>
+<li><p><a class="reference external" href="https://docs.rs/datafusion/latest">Excellent Documentation</a> and a
+<a class="reference external" href="https://arrow.apache.org/datafusion/contributor-guide/communication.html">welcoming community</a>.</p></li>
+<li><p>A state of the art query optimizer with projection and filter pushdown, sort aware optimizations,
+automatic join reordering, expression coercion, and more.</p></li>
+<li><p>Permissive Apache 2.0 License, Apache Software Foundation governance</p></li>
+<li><p>Written in <a class="reference external" href="https://www.rust-lang.org/">Rust</a>, a modern system language with development
+productivity similar to Java or Golang, the performance of C++, and
+<a class="reference external" href="https://insights.stackoverflow.com/survey/2021#technology-most-loved-dreaded-and-wanted">loved by programmers everywhere</a>.</p></li>
+<li><p>Support for <a class="reference external" href="https://substrait.io/">Substrait</a> for query plan serialization, making it easier to integrate DataFusion
+with other projects, and to pass plans across language boundaries.</p></li>
+</ul>
+</section>
 <section id="use-cases">
 <h2>Use Cases<a class="headerlink" href="#use-cases" title="Permalink to this heading">¶</a></h2>
-<p>DataFusion is used to create modern, fast and efficient data
-pipelines, ETL processes, and database systems, which need the
-performance of Rust and Apache Arrow and want to provide their users
-the convenience of an SQL interface or a DataFrame API.</p>
+<p>DataFusion can be used without modification as an embedded SQL
+engine or can be customized and used as a foundation for
+building new systems. Here are some examples of systems built using DataFusion:</p>
+<ul class="simple">
+<li><p>Specialized Analytical Database systems such as [CeresDB] and more general Apache Spark like system such a [Ballista].</p></li>
+<li><p>New query language engines such as [prql-query] and accelerators such as [VegaFusion]</p></li>
+<li><p>Research platform for new Database Systems, such as [Flock]</p></li>
+<li><p>SQL support to another library, such as [dask sql]</p></li>
+<li><p>Streaming data platforms such as [Synnada]</p></li>
+<li><p>Tools for reading / sorting / transcoding Parquet, CSV, AVRO, and JSON files such as [qv]</p></li>
+<li><p>A faster Spark runtime replacement [Blaze]</p></li>
+</ul>
+<p>By using DataFusion, the projects are freed to focus on their specific
+features, and avoid reimplementing general (but still necessary)
+features such as an expression representation, standard optimizations,
+execution plans, file format support, etc.</p>
 </section>
 <section id="why-datafusion">
 <h2>Why DataFusion?<a class="headerlink" href="#why-datafusion" title="Permalink to this heading">¶</a></h2>
