merlimat closed pull request #2849: fixing/adding sql docs to correct locations
URL: https://github.com/apache/pulsar/pull/2849

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/site2/docs/sql-deployment-configurations.md 
b/site2/docs/sql-deployment-configurations.md
new file mode 100644
index 0000000000..ac5814cf86
--- /dev/null
+++ b/site2/docs/sql-deployment-configurations.md
@@ -0,0 +1,152 @@
+---
+id: sql-deployment-configurations
+title: Pulsar SQL Deployment and Configuration
+sidebar_label: Deployment and Configuration
+---
+
+Below is a list of configurations for the Presto Pulsar connector and instructions on how to deploy a cluster.
+
+## Presto Pulsar Connector Configurations
+There are several configurations for the Presto Pulsar Connector.  The properties file that contains these configurations can be found at ```${project.root}/conf/presto/catalog/pulsar.properties```.
+The configurations for the connector and their default values are described below.
+
+```properties
+# name of the connector to be displayed in the catalog
+connector.name=pulsar
+
+# the url of Pulsar broker service
+pulsar.broker-service-url=http://localhost:8080
+
+# URI of Zookeeper cluster
+pulsar.zookeeper-uri=localhost:2181
+
+# minimum number of entries to read at a single time
+pulsar.entry-read-batch-size=100
+
+# default number of splits to use per query
+pulsar.target-num-splits=4
+```
+
+## Query Pulsar from Existing Presto Cluster
+
+If you already have an existing Presto cluster, you can copy the Presto Pulsar connector plugin into it.  You can download the archived plugin package via:
+
+```bash
+$ wget pulsar:binary_release_url
+```
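+
+A rough sketch of what installing the plugin could look like, assuming (hypothetically) that the archive unpacks into a ```pulsar-presto-connector``` directory and that your Presto installation lives under ```/usr/lib/presto```; adjust the names and paths to your actual download and cluster layout:
+
+```bash
+# hypothetical archive and install paths; substitute your own
+$ tar -xzf pulsar-presto-connector.tar.gz
+$ mkdir -p /usr/lib/presto/plugin/pulsar
+$ cp -r pulsar-presto-connector/* /usr/lib/presto/plugin/pulsar/
+# also copy conf/presto/catalog/pulsar.properties into your Presto catalog directory,
+# then restart the Presto workers so the new plugin is picked up
+```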
+
+## Deploying a new cluster
+
+Please note that the [Getting Started](sql-getting-started.md) guide shows you how to easily set up a standalone single-node environment to experiment with.
+
+Pulsar SQL is powered by [Presto](https://prestodb.io), thus many of the deployment configurations for the Pulsar SQL worker are the same as for Presto.
+
+You can use the same CLI args as the Presto launcher:
+
+```bash
+$ ./bin/pulsar sql-worker --help
+Usage: launcher [options] command
+
+Commands: run, start, stop, restart, kill, status
+
+Options:
+  -h, --help            show this help message and exit
+  -v, --verbose         Run verbosely
+  --etc-dir=DIR         Defaults to INSTALL_PATH/etc
+  --launcher-config=FILE
+                        Defaults to INSTALL_PATH/bin/launcher.properties
+  --node-config=FILE    Defaults to ETC_DIR/node.properties
+  --jvm-config=FILE     Defaults to ETC_DIR/jvm.config
+  --config=FILE         Defaults to ETC_DIR/config.properties
+  --log-levels-file=FILE
+                        Defaults to ETC_DIR/log.properties
+  --data-dir=DIR        Defaults to INSTALL_PATH
+  --pid-file=FILE       Defaults to DATA_DIR/var/run/launcher.pid
+  --launcher-log-file=FILE
+                        Defaults to DATA_DIR/var/log/launcher.log (only in
+                        daemon mode)
+  --server-log-file=FILE
+                        Defaults to DATA_DIR/var/log/server.log (only in
+                        daemon mode)
+  -D NAME=VALUE         Set a Java system property
+
+```
+
+A set of default configurations for the cluster is located in ```${project.root}/conf/presto``` and will be used by default.  You can change them to customize your deployment.
+
+You can also set the worker to read from a different configuration directory as well as set a different directory for writing its data:
+
+```bash
+$ ./bin/pulsar sql-worker run --etc-dir /tmp/incubator-pulsar/conf/presto --data-dir /tmp/presto-1
+```
+
+You can also start the worker as a daemon process:
+
+```bash
+$ ./bin/pulsar sql-worker start
+```
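+
+Since the worker is driven by the Presto launcher, the other launcher commands listed in the help above (```status```, ```stop```, ```restart```, and so on) are available the same way. For example, a quick sketch for checking on and stopping a daemonized worker:
+
+```bash
+# check whether the daemonized worker is running
+$ ./bin/pulsar sql-worker status
+
+# stop the daemonized worker
+$ ./bin/pulsar sql-worker stop
+```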
+
+### Deploying to a 3 node cluster
+
+For example, to deploy a Pulsar SQL/Presto cluster on 3 nodes, you can do the following:
+
+First, copy the Pulsar binary distribution to all three nodes.
+
+The first node will run the Presto coordinator.  The minimal configuration in ```${project.root}/conf/presto/config.properties``` can be the following:
+
+```properties
+coordinator=true
+node-scheduler.include-coordinator=true
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=1GB
+discovery-server.enabled=true
+discovery.uri=<coordinator-url>
+```
+
+Also, modify the ```pulsar.broker-service-url``` and ```pulsar.zookeeper-uri``` configs in ```${project.root}/conf/presto/catalog/pulsar.properties``` on those nodes accordingly.
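+
+For instance, a sketch of those two settings, using hypothetical hostnames (```pulsar-broker.example.com``` and ```zk1-3.example.com```) that you would replace with your own brokers and ZooKeeper quorum:
+
+```properties
+# placeholder hostnames; point these at your own cluster
+pulsar.broker-service-url=http://pulsar-broker.example.com:8080
+pulsar.zookeeper-uri=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181
+```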
+
+Afterwards, you can start the coordinator by simply running:
+
+```$ ./bin/pulsar sql-worker run```
+
+For the other two nodes that will only serve as worker nodes, the configurations can be the following:
+
+```properties
+coordinator=false
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=1GB
+discovery.uri=<coordinator-url>
+
+```
+
+Also, modify the ```pulsar.broker-service-url``` and ```pulsar.zookeeper-uri``` configs in ```${project.root}/conf/presto/catalog/pulsar.properties``` accordingly.
+
+You can then start each worker by simply running:
+
+```$ ./bin/pulsar sql-worker run```
+
+You can check the status of your cluster from the SQL CLI.  To start the SQL CLI:
+
+```bash
+$ ./bin/pulsar sql --server <coordinator_url>
+
+```
+
+You can then run the following command to check the status of your nodes:
+
+```bash
+presto> SELECT * FROM system.runtime.nodes;
+ node_id |        http_uri         | node_version | coordinator | state  
+---------+-------------------------+--------------+-------------+--------
+ 1       | http://192.168.2.1:8081 | testversion  | true        | active 
+ 3       | http://192.168.2.2:8081 | testversion  | false       | active 
+ 2       | http://192.168.2.3:8081 | testversion  | false       | active 
+```
+
+
+For more information about deployment in Presto, please refer to:
+
+[Deploying Presto](https://prestodb.io/docs/current/installation/deployment.html)
+
diff --git a/site2/docs/sql-getting-started.md 
b/site2/docs/sql-getting-started.md
new file mode 100644
index 0000000000..8aa06cd28f
--- /dev/null
+++ b/site2/docs/sql-getting-started.md
@@ -0,0 +1,142 @@
+---
+id: sql-getting-started
+title: Pulsar SQL Getting Started
+sidebar_label: Getting Started
+---
+
+It is super easy to get started querying data in Pulsar.
+
+## Requirements
+1. **Pulsar distribution**
+    * If you haven't installed Pulsar, please refer to [Installing Pulsar](io-quickstart.md#installing-pulsar)
+2. **Pulsar built-in connectors**
+    * If you haven't installed the built-in connectors, please refer to [Installing Builtin Connectors](io-quickstart.md#installing-builtin-connectors)
+
+First, start a Pulsar standalone cluster:
+
+```bash
+./bin/pulsar standalone
+```
+
+Next, start a Pulsar SQL worker:
+```bash
+./bin/pulsar sql-worker run
+```
+
+After both the Pulsar standalone cluster and the SQL worker are done 
initializing, run the SQL CLI:
+```bash
+./bin/pulsar sql
+```
+
+You can now start typing some SQL commands:
+
+
+```bash
+presto> show catalogs;
+ Catalog 
+---------
+ pulsar  
+ system  
+(2 rows)
+
+Query 20180829_211752_00004_7qpwh, FINISHED, 1 node
+Splits: 19 total, 19 done (100.00%)
+0:00 [0 rows, 0B] [0 rows/s, 0B/s]
+
+
+presto> show schemas in pulsar;
+        Schema         
+-----------------------
+ information_schema    
+ public/default        
+ public/functions      
+ sample/standalone/ns1 
+(4 rows)
+
+Query 20180829_211818_00005_7qpwh, FINISHED, 1 node
+Splits: 19 total, 19 done (100.00%)
+0:00 [4 rows, 89B] [21 rows/s, 471B/s]
+
+
+presto> show tables in pulsar."public/default";
+ Table 
+-------
+(0 rows)
+
+Query 20180829_211839_00006_7qpwh, FINISHED, 1 node
+Splits: 19 total, 19 done (100.00%)
+0:00 [0 rows, 0B] [0 rows/s, 0B/s]
+
+```
+
+Currently, there is no data in Pulsar that we can query.  Let's start the built-in connector _DataGeneratorSource_ to ingest some mock data for us to query:
+
+```bash
+./bin/pulsar-admin source create --tenant test-tenant --namespace test-namespace --name generator --destinationTopicName generator_test --source-type data-generator
+```
+
+Afterwards, there will be a topic we can query in the namespace "public/default":
+
+```bash
+presto> show tables in pulsar."public/default";
+     Table      
+----------------
+ generator_test 
+(1 row)
+
+Query 20180829_213202_00000_csyeu, FINISHED, 1 node
+Splits: 19 total, 19 done (100.00%)
+0:02 [1 rows, 38B] [0 rows/s, 17B/s]
+```
+
+We can now query the data within the topic "generator_test":
+
+```bash
+presto> select * from pulsar."public/default".generator_test;
+
+ firstname | middlename | lastname  |            email             | username  | password | telephonenumber | age |            companyemail             | nationalidentitycardnumber |
+-----------+------------+-----------+------------------------------+-----------+----------+-----------------+-----+-------------------------------------+----------------------------+
+ Genesis   | Katherine  | Wiley     | genesis.wi...@gmail.com      | genesisw  | y9D2dtU3 | 959-197-1860    |  71 | genesis.wi...@interdemconsulting.eu | 880-58-9247                |
+ Brayden   |            | Stanton   | brayden.stan...@yahoo.com    | braydens  | ZnjmhXik | 220-027-867     |  81 | brayden.stan...@supermemo.eu        | 604-60-7069                |
+ Benjamin  | Julian     | Velasquez | benjamin.velasq...@yahoo.com | benjaminv | 8Bc7m3eb | 298-377-0062    |  21 | benjamin.velasq...@hostesltd.biz    | 213-32-5882                |
+ Michael   | Thomas     | Donovan   | dono...@mail.com             | michaeld  | OqBm9MLs | 078-134-4685    |  55 | michael.dono...@memortech.eu        | 443-30-3442                |
+ Brooklyn  | Avery      | Roach     | brooklynro...@yahoo.com      | broach    | IxtBLafO | 387-786-2998    |  68 | brooklyn.ro...@warst.biz            | 085-88-3973                |
+ Skylar    |            | Bradshaw  | skylarbrads...@yahoo.com     | skylarb   | p6eC6cKy | 210-872-608     |  96 | skylar.brads...@flyhigh.eu          | 453-46-0334                |
+.
+.
+.
+```
+
+Now, you have some mock data to query and play around with!
+
+If you want to try ingesting some of your own data to play around with, you can write a simple producer to write custom-defined data to Pulsar.
+
+For example:
+
+```java
+import org.apache.pulsar.client.api.Producer;
+import org.apache.pulsar.client.api.PulsarClient;
+import org.apache.pulsar.client.impl.schema.AvroSchema;
+
+public class Test {
+
+    public static class Foo {
+        private int field1 = 1;
+        private String field2;
+        private long field3;
+
+        public void setField1(int field1) { this.field1 = field1; }
+        public void setField2(String field2) { this.field2 = field2; }
+        public void setField3(long field3) { this.field3 = field3; }
+    }
+
+    public static void main(String[] args) throws Exception {
+        // connect to the local standalone cluster
+        PulsarClient pulsarClient = PulsarClient.builder().serviceUrl("pulsar://localhost:6650").build();
+
+        // create a producer whose Avro schema is derived from the Foo class
+        Producer<Foo> producer = pulsarClient.newProducer(AvroSchema.of(Foo.class)).topic("test_topic").create();
+
+        for (int i = 0; i < 1000; i++) {
+            Foo foo = new Foo();
+            foo.setField1(i);
+            foo.setField2("foo" + i);
+            foo.setField3(System.currentTimeMillis());
+            producer.newMessage().value(foo).send();
+        }
+        producer.close();
+        pulsarClient.close();
+    }
+}
+```
+
+Afterwards, you should be able to query the data you just wrote.
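+
+For example, assuming the producer above wrote to the topic ```test_topic``` (which, with no tenant or namespace specified, lands in ```public/default```), a query along these lines should return the rows you produced:
+
+```bash
+presto> select * from pulsar."public/default".test_topic;
+```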
diff --git a/site2/docs/sql-overview.md b/site2/docs/sql-overview.md
new file mode 100644
index 0000000000..1df9533e87
--- /dev/null
+++ b/site2/docs/sql-overview.md
@@ -0,0 +1,24 @@
+---
+id: sql-overview
+title: Pulsar SQL Overview
+sidebar_label: Overview
+---
+
+One of the common use cases of Pulsar is storing streams of event data. Often the event data is structured with predefined fields.  There is tremendous value in being able to query the existing data that is already stored in Pulsar topics.  With the implementation of the [Schema Registry](concepts-schema-registry.md), structured data can be stored in Pulsar and queried via SQL.
+
+By leveraging [Presto](https://prestodb.io/), we have created a method for users to query structured data stored within Pulsar in a very efficient and scalable manner. We will discuss why this is efficient and scalable in the [Performance](#performance) section below.
+
+At the core of Pulsar SQL is the Presto Pulsar connector, which allows Presto workers within a Presto cluster to query data from Pulsar.
+
+
+![Pulsar SQL architecture](assets/pulsar-sql-arch-2.png)
+
+
+## Performance
+
+Query performance is very efficient and highly scalable because of Pulsar's [two-level segment-based architecture](concepts-architecture-overview.md#apache-bookkeeper).
+
+Topics in Pulsar are stored as segments in [Apache BookKeeper](https://bookkeeper.apache.org/). Each topic segment is also replicated to a configurable (default 3) number of BookKeeper nodes, which allows for concurrent reads and high read throughput. In the Presto Pulsar connector, we read data directly from BookKeeper to take advantage of Pulsar's segment-based architecture.  Thus, Presto workers can read concurrently from a horizontally scalable number of BookKeeper nodes.
+
+
+![Pulsar SQL architecture](assets/pulsar-sql-arch-1.png)
diff --git a/site2/website/sidebars.json b/site2/website/sidebars.json
index f321d30b43..5afda90a11 100644
--- a/site2/website/sidebars.json
+++ b/site2/website/sidebars.json
@@ -34,6 +34,11 @@
       "io-connectors",
       "io-develop"
     ],
+    "Pulsar SQL": [
+      "sql-overview",
+      "sql-getting-started",
+      "sql-deployment-configurations"
+    ],
     "Deployment": [
       "deploy-aws",
       "deploy-kubernetes",
diff --git 
a/site2/website/versioned_docs/version-2.2.0/sql-deployment-configurations.md 
b/site2/website/versioned_docs/version-2.2.0/sql-deployment-configurations.md
new file mode 100644
index 0000000000..be035ec6db
--- /dev/null
+++ 
b/site2/website/versioned_docs/version-2.2.0/sql-deployment-configurations.md
@@ -0,0 +1,153 @@
+---
+id: version-2.2.0-sql-deployment-configurations
+title: Pulsar SQL Deployment and Configuration
+sidebar_label: Deployment and Configuration
+original_id: sql-deployment-configurations
+---
+
+Below is a list of configurations for the Presto Pulsar connector and instructions on how to deploy a cluster.
+
+## Presto Pulsar Connector Configurations
+There are several configurations for the Presto Pulsar Connector.  The properties file that contains these configurations can be found at ```${project.root}/conf/presto/catalog/pulsar.properties```.
+The configurations for the connector and their default values are described below.
+
+```properties
+# name of the connector to be displayed in the catalog
+connector.name=pulsar
+
+# the url of Pulsar broker service
+pulsar.broker-service-url=http://localhost:8080
+
+# URI of Zookeeper cluster
+pulsar.zookeeper-uri=localhost:2181
+
+# minimum number of entries to read at a single time
+pulsar.entry-read-batch-size=100
+
+# default number of splits to use per query
+pulsar.target-num-splits=4
+```
+
+## Query Pulsar from Existing Presto Cluster
+
+If you already have an existing Presto cluster, you can copy the Presto Pulsar connector plugin into it.  You can download the archived plugin package via:
+
+```bash
+$ wget pulsar:binary_release_url
+```
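+
+A rough sketch of what installing the plugin could look like, assuming (hypothetically) that the archive unpacks into a ```pulsar-presto-connector``` directory and that your Presto installation lives under ```/usr/lib/presto```; adjust the names and paths to your actual download and cluster layout:
+
+```bash
+# hypothetical archive and install paths; substitute your own
+$ tar -xzf pulsar-presto-connector.tar.gz
+$ mkdir -p /usr/lib/presto/plugin/pulsar
+$ cp -r pulsar-presto-connector/* /usr/lib/presto/plugin/pulsar/
+# also copy conf/presto/catalog/pulsar.properties into your Presto catalog directory,
+# then restart the Presto workers so the new plugin is picked up
+```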
+
+## Deploying a new cluster
+
+Please note that the [Getting Started](sql-getting-started.md) guide shows you how to easily set up a standalone single-node environment to experiment with.
+
+Pulsar SQL is powered by [Presto](https://prestodb.io), thus many of the deployment configurations for the Pulsar SQL worker are the same as for Presto.
+
+You can use the same CLI args as the Presto launcher:
+
+```bash
+$ ./bin/pulsar sql-worker --help
+Usage: launcher [options] command
+
+Commands: run, start, stop, restart, kill, status
+
+Options:
+  -h, --help            show this help message and exit
+  -v, --verbose         Run verbosely
+  --etc-dir=DIR         Defaults to INSTALL_PATH/etc
+  --launcher-config=FILE
+                        Defaults to INSTALL_PATH/bin/launcher.properties
+  --node-config=FILE    Defaults to ETC_DIR/node.properties
+  --jvm-config=FILE     Defaults to ETC_DIR/jvm.config
+  --config=FILE         Defaults to ETC_DIR/config.properties
+  --log-levels-file=FILE
+                        Defaults to ETC_DIR/log.properties
+  --data-dir=DIR        Defaults to INSTALL_PATH
+  --pid-file=FILE       Defaults to DATA_DIR/var/run/launcher.pid
+  --launcher-log-file=FILE
+                        Defaults to DATA_DIR/var/log/launcher.log (only in
+                        daemon mode)
+  --server-log-file=FILE
+                        Defaults to DATA_DIR/var/log/server.log (only in
+                        daemon mode)
+  -D NAME=VALUE         Set a Java system property
+
+```
+
+A set of default configurations for the cluster is located in ```${project.root}/conf/presto``` and will be used by default.  You can change them to customize your deployment.
+
+You can also set the worker to read from a different configuration directory as well as set a different directory for writing its data:
+
+```bash
+$ ./bin/pulsar sql-worker run --etc-dir /tmp/incubator-pulsar/conf/presto --data-dir /tmp/presto-1
+```
+
+You can also start the worker as a daemon process:
+
+```bash
+$ ./bin/pulsar sql-worker start
+```
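+
+Since the worker is driven by the Presto launcher, the other launcher commands listed in the help above (```status```, ```stop```, ```restart```, and so on) are available the same way. For example, a quick sketch for checking on and stopping a daemonized worker:
+
+```bash
+# check whether the daemonized worker is running
+$ ./bin/pulsar sql-worker status
+
+# stop the daemonized worker
+$ ./bin/pulsar sql-worker stop
+```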
+
+### Deploying to a 3 node cluster
+
+For example, to deploy a Pulsar SQL/Presto cluster on 3 nodes, you can do the following:
+
+First, copy the Pulsar binary distribution to all three nodes.
+
+The first node will run the Presto coordinator.  The minimal configuration in ```${project.root}/conf/presto/config.properties``` can be the following:
+
+```properties
+coordinator=true
+node-scheduler.include-coordinator=true
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=1GB
+discovery-server.enabled=true
+discovery.uri=<coordinator-url>
+```
+
+Also, modify the ```pulsar.broker-service-url``` and ```pulsar.zookeeper-uri``` configs in ```${project.root}/conf/presto/catalog/pulsar.properties``` on those nodes accordingly.
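+
+For instance, a sketch of those two settings, using hypothetical hostnames (```pulsar-broker.example.com``` and ```zk1-3.example.com```) that you would replace with your own brokers and ZooKeeper quorum:
+
+```properties
+# placeholder hostnames; point these at your own cluster
+pulsar.broker-service-url=http://pulsar-broker.example.com:8080
+pulsar.zookeeper-uri=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181
+```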
+
+Afterwards, you can start the coordinator by simply running:
+
+```$ ./bin/pulsar sql-worker run```
+
+For the other two nodes that will only serve as worker nodes, the configurations can be the following:
+
+```properties
+coordinator=false
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=1GB
+discovery.uri=<coordinator-url>
+
+```
+
+Also, modify the ```pulsar.broker-service-url``` and ```pulsar.zookeeper-uri``` configs in ```${project.root}/conf/presto/catalog/pulsar.properties``` accordingly.
+
+You can then start each worker by simply running:
+
+```$ ./bin/pulsar sql-worker run```
+
+You can check the status of your cluster from the SQL CLI.  To start the SQL CLI:
+
+```bash
+$ ./bin/pulsar sql --server <coordinator_url>
+
+```
+
+You can then run the following command to check the status of your nodes:
+
+```bash
+presto> SELECT * FROM system.runtime.nodes;
+ node_id |        http_uri         | node_version | coordinator | state  
+---------+-------------------------+--------------+-------------+--------
+ 1       | http://192.168.2.1:8081 | testversion  | true        | active 
+ 3       | http://192.168.2.2:8081 | testversion  | false       | active 
+ 2       | http://192.168.2.3:8081 | testversion  | false       | active 
+```
+
+
+For more information about deployment in Presto, please refer to:
+
+[Deploying Presto](https://prestodb.io/docs/current/installation/deployment.html)
+
diff --git a/site2/website/versioned_docs/version-2.2.0/sql-getting-started.md 
b/site2/website/versioned_docs/version-2.2.0/sql-getting-started.md
new file mode 100644
index 0000000000..135fd734ba
--- /dev/null
+++ b/site2/website/versioned_docs/version-2.2.0/sql-getting-started.md
@@ -0,0 +1,143 @@
+---
+id: version-2.2.0-sql-getting-started
+title: Pulsar SQL Getting Started
+sidebar_label: Getting Started
+original_id: sql-getting-started
+---
+
+It is super easy to get started querying data in Pulsar.
+
+## Requirements
+1. **Pulsar distribution**
+    * If you haven't installed Pulsar, please refer to [Installing Pulsar](io-quickstart.md#installing-pulsar)
+2. **Pulsar built-in connectors**
+    * If you haven't installed the built-in connectors, please refer to [Installing Builtin Connectors](io-quickstart.md#installing-builtin-connectors)
+
+First, start a Pulsar standalone cluster:
+
+```bash
+./bin/pulsar standalone
+```
+
+Next, start a Pulsar SQL worker:
+```bash
+./bin/pulsar sql-worker run
+```
+
+After both the Pulsar standalone cluster and the SQL worker are done 
initializing, run the SQL CLI:
+```bash
+./bin/pulsar sql
+```
+
+You can now start typing some SQL commands:
+
+
+```bash
+presto> show catalogs;
+ Catalog 
+---------
+ pulsar  
+ system  
+(2 rows)
+
+Query 20180829_211752_00004_7qpwh, FINISHED, 1 node
+Splits: 19 total, 19 done (100.00%)
+0:00 [0 rows, 0B] [0 rows/s, 0B/s]
+
+
+presto> show schemas in pulsar;
+        Schema         
+-----------------------
+ information_schema    
+ public/default        
+ public/functions      
+ sample/standalone/ns1 
+(4 rows)
+
+Query 20180829_211818_00005_7qpwh, FINISHED, 1 node
+Splits: 19 total, 19 done (100.00%)
+0:00 [4 rows, 89B] [21 rows/s, 471B/s]
+
+
+presto> show tables in pulsar."public/default";
+ Table 
+-------
+(0 rows)
+
+Query 20180829_211839_00006_7qpwh, FINISHED, 1 node
+Splits: 19 total, 19 done (100.00%)
+0:00 [0 rows, 0B] [0 rows/s, 0B/s]
+
+```
+
+Currently, there is no data in Pulsar that we can query.  Let's start the built-in connector _DataGeneratorSource_ to ingest some mock data for us to query:
+
+```bash
+./bin/pulsar-admin source create --tenant test-tenant --namespace test-namespace --name generator --destinationTopicName generator_test --source-type data-generator
+```
+
+Afterwards, there will be a topic we can query in the namespace "public/default":
+
+```bash
+presto> show tables in pulsar."public/default";
+     Table      
+----------------
+ generator_test 
+(1 row)
+
+Query 20180829_213202_00000_csyeu, FINISHED, 1 node
+Splits: 19 total, 19 done (100.00%)
+0:02 [1 rows, 38B] [0 rows/s, 17B/s]
+```
+
+We can now query the data within the topic "generator_test":
+
+```bash
+presto> select * from pulsar."public/default".generator_test;
+
+ firstname | middlename | lastname  |            email             | username  | password | telephonenumber | age |            companyemail             | nationalidentitycardnumber |
+-----------+------------+-----------+------------------------------+-----------+----------+-----------------+-----+-------------------------------------+----------------------------+
+ Genesis   | Katherine  | Wiley     | genesis.wi...@gmail.com      | genesisw  | y9D2dtU3 | 959-197-1860    |  71 | genesis.wi...@interdemconsulting.eu | 880-58-9247                |
+ Brayden   |            | Stanton   | brayden.stan...@yahoo.com    | braydens  | ZnjmhXik | 220-027-867     |  81 | brayden.stan...@supermemo.eu        | 604-60-7069                |
+ Benjamin  | Julian     | Velasquez | benjamin.velasq...@yahoo.com | benjaminv | 8Bc7m3eb | 298-377-0062    |  21 | benjamin.velasq...@hostesltd.biz    | 213-32-5882                |
+ Michael   | Thomas     | Donovan   | dono...@mail.com             | michaeld  | OqBm9MLs | 078-134-4685    |  55 | michael.dono...@memortech.eu        | 443-30-3442                |
+ Brooklyn  | Avery      | Roach     | brooklynro...@yahoo.com      | broach    | IxtBLafO | 387-786-2998    |  68 | brooklyn.ro...@warst.biz            | 085-88-3973                |
+ Skylar    |            | Bradshaw  | skylarbrads...@yahoo.com     | skylarb   | p6eC6cKy | 210-872-608     |  96 | skylar.brads...@flyhigh.eu          | 453-46-0334                |
+.
+.
+.
+```
+
+Now, you have some mock data to query and play around with!
+
+If you want to try ingesting some of your own data to play around with, you can write a simple producer to write custom-defined data to Pulsar.
+
+For example:
+
+```java
+import org.apache.pulsar.client.api.Producer;
+import org.apache.pulsar.client.api.PulsarClient;
+import org.apache.pulsar.client.impl.schema.AvroSchema;
+
+public class Test {
+
+    public static class Foo {
+        private int field1 = 1;
+        private String field2;
+        private long field3;
+
+        public void setField1(int field1) { this.field1 = field1; }
+        public void setField2(String field2) { this.field2 = field2; }
+        public void setField3(long field3) { this.field3 = field3; }
+    }
+
+    public static void main(String[] args) throws Exception {
+        // connect to the local standalone cluster
+        PulsarClient pulsarClient = PulsarClient.builder().serviceUrl("pulsar://localhost:6650").build();
+
+        // create a producer whose Avro schema is derived from the Foo class
+        Producer<Foo> producer = pulsarClient.newProducer(AvroSchema.of(Foo.class)).topic("test_topic").create();
+
+        for (int i = 0; i < 1000; i++) {
+            Foo foo = new Foo();
+            foo.setField1(i);
+            foo.setField2("foo" + i);
+            foo.setField3(System.currentTimeMillis());
+            producer.newMessage().value(foo).send();
+        }
+        producer.close();
+        pulsarClient.close();
+    }
+}
+```
+
+Afterwards, you should be able to query the data you just wrote.
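+
+For example, assuming the producer above wrote to the topic ```test_topic``` (which, with no tenant or namespace specified, lands in ```public/default```), a query along these lines should return the rows you produced:
+
+```bash
+presto> select * from pulsar."public/default".test_topic;
+```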
diff --git a/site2/website/versioned_docs/version-2.2.0/sql-overview.md 
b/site2/website/versioned_docs/version-2.2.0/sql-overview.md
new file mode 100644
index 0000000000..ba4594d125
--- /dev/null
+++ b/site2/website/versioned_docs/version-2.2.0/sql-overview.md
@@ -0,0 +1,25 @@
+---
+id: version-2.2.0-sql-overview
+title: Pulsar SQL Overview
+sidebar_label: Overview
+original_id: sql-overview
+---
+
+One of the common use cases of Pulsar is storing streams of event data. Often the event data is structured with predefined fields.  There is tremendous value in being able to query the existing data that is already stored in Pulsar topics.  With the implementation of the [Schema Registry](concepts-schema-registry.md), structured data can be stored in Pulsar and queried via SQL.
+
+By leveraging [Presto](https://prestodb.io/), we have created a method for users to query structured data stored within Pulsar in a very efficient and scalable manner. We will discuss why this is efficient and scalable in the [Performance](#performance) section below.
+
+At the core of Pulsar SQL is the Presto Pulsar connector, which allows Presto workers within a Presto cluster to query data from Pulsar.
+
+
+![Pulsar SQL architecture](assets/pulsar-sql-arch-2.png)
+
+
+## Performance
+
+Query performance is very efficient and highly scalable because of Pulsar's [two-level segment-based architecture](concepts-architecture-overview.md#apache-bookkeeper).
+
+Topics in Pulsar are stored as segments in [Apache BookKeeper](https://bookkeeper.apache.org/). Each topic segment is also replicated to a configurable (default 3) number of BookKeeper nodes, which allows for concurrent reads and high read throughput. In the Presto Pulsar connector, we read data directly from BookKeeper to take advantage of Pulsar's segment-based architecture.  Thus, Presto workers can read concurrently from a horizontally scalable number of BookKeeper nodes.
+
+
+![Pulsar SQL architecture](assets/pulsar-sql-arch-1.png)
diff --git a/site2/website/versioned_sidebars/version-2.2.0-sidebars.json 
b/site2/website/versioned_sidebars/version-2.2.0-sidebars.json
new file mode 100644
index 0000000000..a18cdc61eb
--- /dev/null
+++ b/site2/website/versioned_sidebars/version-2.2.0-sidebars.json
@@ -0,0 +1,121 @@
+{
+  "version-2.2.0-docs": {
+    "Getting started": [
+      "version-2.2.0-pulsar-2.0",
+      "version-2.2.0-standalone",
+      "version-2.2.0-standalone-docker",
+      "version-2.2.0-client-libraries"
+    ],
+    "Concepts and Architecture": [
+      "version-2.2.0-concepts-overview",
+      "version-2.2.0-concepts-messaging",
+      "version-2.2.0-concepts-architecture-overview",
+      "version-2.2.0-concepts-clients",
+      "version-2.2.0-concepts-replication",
+      "version-2.2.0-concepts-multi-tenancy",
+      "version-2.2.0-concepts-authentication",
+      "version-2.2.0-concepts-topic-compaction",
+      "version-2.2.0-concepts-tiered-storage",
+      "version-2.2.0-concepts-schema-registry"
+    ],
+    "Pulsar Functions": [
+      "version-2.2.0-functions-overview",
+      "version-2.2.0-functions-quickstart",
+      "version-2.2.0-functions-api",
+      "version-2.2.0-functions-deploying",
+      "version-2.2.0-functions-guarantees",
+      "version-2.2.0-functions-state",
+      "version-2.2.0-functions-metrics"
+    ],
+    "Pulsar IO": [
+      "version-2.2.0-io-overview",
+      "version-2.2.0-io-quickstart",
+      "version-2.2.0-io-managing",
+      "version-2.2.0-io-connectors",
+      "version-2.2.0-io-develop"
+    ],
+    "Pulsar SQL": [
+      "version-2.2.0-sql-overview",
+      "version-2.2.0-sql-getting-started",
+      "version-2.2.0-sql-deployment-configurations"
+    ],
+    "Deployment": [
+      "version-2.2.0-deploy-aws",
+      "version-2.2.0-deploy-kubernetes",
+      "version-2.2.0-deploy-bare-metal",
+      "version-2.2.0-deploy-bare-metal-multi-cluster",
+      "version-2.2.0-deploy-dcos",
+      "version-2.2.0-deploy-monitoring"
+    ],
+    "Administration": [
+      "version-2.2.0-administration-zk-bk",
+      "version-2.2.0-administration-geo",
+      "version-2.2.0-administration-dashboard",
+      "version-2.2.0-administration-stats",
+      "version-2.2.0-administration-load-distribution",
+      "version-2.2.0-administration-proxy"
+    ],
+    "Security": [
+      "version-2.2.0-security-overview",
+      "version-2.2.0-security-tls-transport",
+      "version-2.2.0-security-tls-authentication",
+      "version-2.2.0-security-athenz",
+      "version-2.2.0-security-authorization",
+      "version-2.2.0-security-encryption",
+      "version-2.2.0-security-extending"
+    ],
+    "Client libraries": [
+      "version-2.2.0-client-libraries-java",
+      "version-2.2.0-client-libraries-go",
+      "version-2.2.0-client-libraries-python",
+      "version-2.2.0-client-libraries-cpp",
+      "version-2.2.0-client-libraries-websocket"
+    ],
+    "Admin API": [
+      "version-2.2.0-admin-api-overview",
+      "version-2.2.0-admin-api-clusters",
+      "version-2.2.0-admin-api-tenants",
+      "version-2.2.0-admin-api-brokers",
+      "version-2.2.0-admin-api-namespaces",
+      "version-2.2.0-admin-api-permissions",
+      "version-2.2.0-admin-api-persistent-topics",
+      "version-2.2.0-admin-api-non-persistent-topics",
+      "version-2.2.0-admin-api-partitioned-topics",
+      "version-2.2.0-admin-api-schemas"
+    ],
+    "Adaptors": [
+      "version-2.2.0-adaptors-kafka",
+      "version-2.2.0-adaptors-spark",
+      "version-2.2.0-adaptors-storm"
+    ],
+    "Cookbooks": [
+      "version-2.2.0-cookbooks-tiered-storage",
+      "version-2.2.0-cookbooks-compaction",
+      "version-2.2.0-cookbooks-deduplication",
+      "version-2.2.0-cookbooks-non-persistent",
+      "version-2.2.0-cookbooks-partitioned",
+      "version-2.2.0-cookbooks-retention-expiry",
+      "version-2.2.0-cookbooks-encryption",
+      "version-2.2.0-cookbooks-message-queue"
+    ],
+    "Development": [
+      "version-2.2.0-develop-tools",
+      "version-2.2.0-develop-binary-protocol",
+      "version-2.2.0-develop-schema",
+      "version-2.2.0-develop-load-manager",
+      "version-2.2.0-develop-cpp"
+    ],
+    "Reference": [
+      "version-2.2.0-reference-terminology",
+      "version-2.2.0-reference-cli-tools",
+      "version-2.2.0-pulsar-admin",
+      "version-2.2.0-reference-configuration"
+    ]
+  },
+  "version-2.2.0-docs-other": {
+    "First Category": [
+      "version-2.2.0-doc4",
+      "version-2.2.0-doc5"
+    ]
+  }
+}


 
