This is an automated email from the ASF dual-hosted git repository.

jakevin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow-datafusion.git


The following commit(s) were added to refs/heads/master by this push:
     new 4bea81b5d Document ability to select directly from files in datafusion-cli (#4851)
4bea81b5d is described below

commit 4bea81b5d1c7b2f81cc6c140abc7d927220bec91
Author: Andrew Lamb <and...@nerdnetworks.org>
AuthorDate: Mon Jan 9 09:16:21 2023 -0500

    Document ability to select directly from files in datafusion-cli (#4851)
    
    * Document ability to select directly from files in datafusion-cli
    
    * prettier
    
    * Update docs/source/user-guide/cli.md
    
    Co-authored-by: Liang-Chi Hsieh <vii...@gmail.com>
    
---
 docs/source/user-guide/cli.md | 63 +++++++++++++++++++++++++++++++------------
 1 file changed, 46 insertions(+), 17 deletions(-)

diff --git a/docs/source/user-guide/cli.md b/docs/source/user-guide/cli.md
index 3a4c453a7..d3512a6dc 100644
--- a/docs/source/user-guide/cli.md
+++ b/docs/source/user-guide/cli.md
@@ -19,30 +19,51 @@
 
 # DataFusion Command-line SQL Utility
 
-The DataFusion CLI is a command-line interactive SQL utility that allows
-queries to be executed against any supported data files. It is a convenient way to
+The DataFusion CLI is a command-line interactive SQL utility for executing
+queries against any supported data files. It is a convenient way to
 try DataFusion out with your own data sources, and test out its SQL support.
 
 ## Example
 
 Create a CSV file to query.
 
-```bash
-$ echo "1,2" > data.csv
+```shell
+$ echo "a,b" > data.csv
+$ echo "1,2" >> data.csv
 ```
 
-```bash
+Query that single file (the CLI also supports parquet, compressed csv, avro, json and more)
+
+```shell
 $ datafusion-cli
-DataFusion CLI v12.0.0
-❯ CREATE EXTERNAL TABLE foo STORED AS CSV LOCATION 'data.csv';
-0 rows in set. Query took 0.017 seconds.
-❯ select * from foo;
-+----------+----------+
-| column_1 | column_2 |
-+----------+----------+
-| 1        | 2        |
-+----------+----------+
-1 row in set. Query took 0.012 seconds.
+DataFusion CLI v17.0.0
+❯ select * from 'data.csv';
++---+---+
+| a | b |
++---+---+
+| 1 | 2 |
++---+---+
+1 row in set. Query took 0.007 seconds.
+```
+
+You can also query directories of files with compatible schemas:
+
+```shell
+$ ls data_dir/
+data.csv   data2.csv
+```
+
+```shell
+$ datafusion-cli
+DataFusion CLI v16.0.0
+❯ select * from 'data_dir';
++---+---+
+| a | b |
++---+---+
+| 3 | 4 |
+| 1 | 2 |
++---+---+
+2 rows in set. Query took 0.007 seconds.
 ```
 
 ## Installation
@@ -87,6 +108,8 @@ docker run -it -v $(your_data_location):/data datafusion-cli
 
 ## Usage
 
+See the current usage using `datafusion-cli --help`:
+
 ```bash
 Apache Arrow <d...@arrow.apache.org>
 Command Line Client for DataFusion query engine.
@@ -104,10 +127,16 @@ OPTIONS:
     -q, --quiet                      Reduce printing other than the results and work quietly
     -r, --rc <RC>...                 Run the provided files on startup instead of ~/.datafusionrc
     -V, --version                    Print version information
-
-Type `exit` or `quit` to exit the CLI.
 ```
 
+## Selecting files directly
+
+Files can be queried directly by enclosing the file or
+directory name in single `'` quotes as shown in the example.
+
+It is also possible to create a table backed by files explicitly
+via `CREATE EXTERNAL TABLE` as shown below.
+
 ## Registering Parquet Data Sources
 
 Parquet data sources can be registered by executing a `CREATE EXTERNAL TABLE` SQL statement. It is not necessary to provide schema information for Parquet files.
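
The two approaches documented in this commit (selecting directly from a quoted path, and registering the same files with `CREATE EXTERNAL TABLE`) can be sketched as a non-interactive shell script. This is an illustration, not part of the commit. It assumes `datafusion-cli` is on `PATH`, that it accepts `-f` to run statements from a file, and that the CSV form of `CREATE EXTERNAL TABLE` accepts `WITH HEADER ROW` in this release. The table name `foo` and the file name `query.sql` are hypothetical.

```shell
# Recreate the sample directory from the documentation example:
# two CSV files that share the same header (a,b).
mkdir -p data_dir
printf 'a,b\n1,2\n' > data_dir/data.csv
printf 'a,b\n3,4\n' > data_dir/data2.csv

# Write both query styles to a script file (hypothetical name: query.sql).
# 1. Select directly from the directory by single-quoting its path.
# 2. Register the same directory as a named table, then query the table.
#    WITH HEADER ROW is assumed to be supported by the installed version.
cat > query.sql <<'EOF'
select * from 'data_dir';
CREATE EXTERNAL TABLE foo STORED AS CSV WITH HEADER ROW LOCATION 'data_dir';
select * from foo;
EOF

# Run the script only if the CLI is installed.
# Assumption: -f executes the statements in the given file, then exits.
if command -v datafusion-cli >/dev/null 2>&1; then
  datafusion-cli -f query.sql
fi
```

If the assumptions hold, both `select` statements should return the same two rows. Row order is not guaranteed when reading multiple files.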
