This is an automated email from the ASF dual-hosted git repository.
yiguolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
new 9831b94961 [doc] Fix Arrow Flight docs (#982)
9831b94961 is described below
commit 9831b949611715e328c718b1d3cfc4210f7402a0
Author: Xinyi Zou <[email protected]>
AuthorDate: Sat Aug 17 18:10:52 2024 +0800
[doc] Fix Arrow Flight docs (#982)
---
...in-apache-doris-for-10x-faster-data-transfer.md | 11 ++++--
blog/release-note-2.1.0.md | 2 +-
common_docs_zh/releasenotes/v2.1/release-2.1.0.md | 2 +-
docs/db-connect/arrow-flight-sql-connect.md | 31 ++++++++++++----
.../current/db-connect/arrow-flight-sql-connect.md | 32 +++++++++++++----
.../db-connect/arrow-flight-sql-connect.md | 42 +++++++++++++++-------
.../db-connect/arrow-flight-sql-connect.md | 32 +++++++++++++----
.../db-connect/arrow-flight-sql-connect.md | 38 +++++++++++++++-----
.../db-connect/arrow-flight-sql-connect.md | 31 ++++++++++++----
9 files changed, 168 insertions(+), 53 deletions(-)
diff --git a/blog/arrow-flight-sql-in-apache-doris-for-10x-faster-data-transfer.md b/blog/arrow-flight-sql-in-apache-doris-for-10x-faster-data-transfer.md
index 5e985eed75..be6154f434 100644
--- a/blog/arrow-flight-sql-in-apache-doris-for-10x-faster-data-transfer.md
+++ b/blog/arrow-flight-sql-in-apache-doris-for-10x-faster-data-transfer.md
@@ -82,6 +82,11 @@ Import the following module/library to interact with the installed library:
```Python
import adbc_driver_manager
import adbc_driver_flightsql.dbapi as flight_sql
+
+>>> print(adbc_driver_manager.__version__)
+1.1.0
+>>> print(adbc_driver_flightsql.__version__)
+1.1.0
```
### 02 Connect to Doris
@@ -97,7 +102,7 @@ Configure parameters for Doris frontend (FE) and backend (BE):
Suppose that the Arrow Flight SQL services for the Doris instance will run on ports 9090 and 9091 for FE and BE respectively, and the Doris username/password is "user" and "pass", the connection process would be:
```C++
-conn = flight_sql.connect(uri="grpc://127.0.0.1:9090", db_kwargs={
+conn = flight_sql.connect(uri="grpc://{FE_HOST}:{fe.conf:arrow_flight_sql_port}", db_kwargs={
adbc_driver_manager.DatabaseOptions.USERNAME.value: "user",
adbc_driver_manager.DatabaseOptions.PASSWORD.value: "pass",
})
@@ -246,7 +251,7 @@ import adbc_driver_flightsql.dbapi as flight_sql
# step 2, create a client that interacts with the Doris Arrow Flight SQL service.
# Modify arrow_flight_sql_port in fe/conf/fe.conf to an available port, such as 9090.
# Modify arrow_flight_sql_port in be/conf/be.conf to an available port, such as 9091.
-conn = flight_sql.connect(uri="grpc://127.0.0.1:9090", db_kwargs={
+conn = flight_sql.connect(uri="grpc://{FE_HOST}:{fe.conf:arrow_flight_sql_port}", db_kwargs={
adbc_driver_manager.DatabaseOptions.USERNAME.value: "root",
adbc_driver_manager.DatabaseOptions.PASSWORD.value: "",
})
@@ -401,7 +406,7 @@ import java.sql.ResultSet;
import java.sql.Statement;
Class.forName("org.apache.arrow.driver.jdbc.ArrowFlightJdbcDriver");
-String DB_URL = "jdbc:arrow-flight-sql://0.0.0.0:9090?useServerPrepStmts=false"
+String DB_URL = "jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}?useServerPrepStmts=false"
+ "&cachePrepStmts=true&useSSL=false&useEncryption=false";
String USER = "root";
String PASS = "";
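The `{FE_HOST}` and `{fe.conf:arrow_flight_sql_port}` tokens introduced above are placeholders to be substituted by the reader, not literal URI syntax. As a minimal sketch of that substitution, the helper below builds the `grpc://` URI from a host and port; the function name `flight_sql_uri` and the validation rules are illustrative assumptions, not part of Doris or ADBC.

```python
def flight_sql_uri(fe_host: str, arrow_flight_sql_port: int) -> str:
    """Build the gRPC URI for a Doris FE Arrow Flight SQL endpoint.

    Hypothetical helper: fe_host is the FE address and arrow_flight_sql_port
    is the value configured as arrow_flight_sql_port in fe/conf/fe.conf.
    """
    if not fe_host:
        raise ValueError("fe_host must not be empty")
    if not 0 < arrow_flight_sql_port < 65536:
        raise ValueError("arrow_flight_sql_port must be a valid TCP port")
    return f"grpc://{fe_host}:{arrow_flight_sql_port}"


# Example values only; 9090 is the port the docs suggest as an available port.
print(flight_sql_uri("127.0.0.1", 9090))  # prints grpc://127.0.0.1:9090
```

The resulting string is what would be passed as `uri=` to `flight_sql.connect(...)` in the snippets of this commit.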
diff --git a/blog/release-note-2.1.0.md b/blog/release-note-2.1.0.md
index 84af6b2678..7fee163d1e 100644
--- a/blog/release-note-2.1.0.md
+++ b/blog/release-note-2.1.0.md
@@ -165,7 +165,7 @@ Now this is revolutionized in Doris V2.1, where we provide a high-throughput dat
This allows fast data access to Apache Doris by data science tools like Pandas and Numpy, which means Apache Doris can be seamlessly integrated with the entire AI and data science ecosystem. This unveils a future of endless possibilities.
```C++
-conn = flight_sql.connect(uri="grpc://127.0.0.1:9090", db_kwargs={
+conn = flight_sql.connect(uri="grpc://{FE_HOST}:{fe.conf:arrow_flight_sql_port}", db_kwargs={
adbc_driver_manager.DatabaseOptions.USERNAME.value: "user",
adbc_driver_manager.DatabaseOptions.PASSWORD.value: "pass",
})
diff --git a/common_docs_zh/releasenotes/v2.1/release-2.1.0.md b/common_docs_zh/releasenotes/v2.1/release-2.1.0.md
index 7fc3a26520..f97361fbf0 100644
--- a/common_docs_zh/releasenotes/v2.1/release-2.1.0.md
+++ b/common_docs_zh/releasenotes/v2.1/release-2.1.0.md
@@ -161,7 +161,7 @@ under the License.
Building on this, Apache Doris can integrate well with the entire AI and data science ecosystem, which is also an important direction for future development.
```C++
-conn = flight_sql.connect(uri="grpc://127.0.0.1:9090", db_kwargs={
+conn = flight_sql.connect(uri="grpc://{FE_HOST}:{fe.conf:arrow_flight_sql_port}", db_kwargs={
adbc_driver_manager.DatabaseOptions.USERNAME.value: "user",
adbc_driver_manager.DatabaseOptions.PASSWORD.value: "pass",
})
diff --git a/docs/db-connect/arrow-flight-sql-connect.md b/docs/db-connect/arrow-flight-sql-connect.md
index 8ca606ab65..0c9e326f5f 100644
--- a/docs/db-connect/arrow-flight-sql-connect.md
+++ b/docs/db-connect/arrow-flight-sql-connect.md
@@ -58,6 +58,11 @@ Import the following modules/libraries in the code to use the installed Library:
```Python
import adbc_driver_manager
import adbc_driver_flightsql.dbapi as flight_sql
+
+>>> print(adbc_driver_manager.__version__)
+1.1.0
+>>> print(adbc_driver_flightsql.__version__)
+1.1.0
```
### Connect to Doris
@@ -73,7 +78,7 @@ Modify the configuration parameters of Doris FE and BE:
Assuming that the Arrow Flight SQL services of FE and BE in the Doris instance will run on ports 9090 and 9091 respectively, and the Doris username/password is "user"/"pass", the connection process is as follows:
```Python
-conn = flight_sql.connect(uri="grpc://127.0.0.1:9090", db_kwargs={
+conn = flight_sql.connect(uri="grpc://{FE_HOST}:{fe.conf:arrow_flight_sql_port}", db_kwargs={
adbc_driver_manager.DatabaseOptions.USERNAME.value: "user",
adbc_driver_manager.DatabaseOptions.PASSWORD.value: "pass",
})
@@ -222,7 +227,7 @@ import adbc_driver_flightsql.dbapi as flight_sql
# step 2, create a client that interacts with the Doris Arrow Flight SQL service.
# Modify arrow_flight_sql_port in fe/conf/fe.conf to an available port, such as 9090.
# Modify arrow_flight_sql_port in be/conf/be.conf to an available port, such as 9091.
-conn = flight_sql.connect(uri="grpc://127.0.0.1:9090", db_kwargs={
+conn = flight_sql.connect(uri="grpc://{FE_HOST}:{fe.conf:arrow_flight_sql_port}", db_kwargs={
adbc_driver_manager.DatabaseOptions.USERNAME.value: "root",
adbc_driver_manager.DatabaseOptions.PASSWORD.value: "",
})
@@ -301,7 +306,7 @@ import java.sql.ResultSet;
import java.sql.Statement;
Class.forName("org.apache.arrow.driver.jdbc.ArrowFlightJdbcDriver");
-String DB_URL = "jdbc:arrow-flight-sql://0.0.0.0:9090?useServerPrepStmts=false"
+String DB_URL = "jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}?useServerPrepStmts=false"
+ "&cachePrepStmts=true&useSSL=false&useEncryption=false";
String USER = "root";
String PASS = "";
@@ -366,7 +371,7 @@ The connection code example is as follows:
final BufferAllocator allocator = new RootAllocator();
FlightSqlDriver driver = new FlightSqlDriver(allocator);
Map<String, Object> parameters = new HashMap<>();
-AdbcDriver.PARAM_URI.set(parameters, Location.forGrpcInsecure("0.0.0.0", 9090).getUri().toString());
+AdbcDriver.PARAM_URI.set(parameters, Location.forGrpcInsecure("{FE_HOST}", {fe.conf:arrow_flight_sql_port}).getUri().toString());
AdbcDriver.PARAM_USERNAME.set(parameters, "root");
AdbcDriver.PARAM_PASSWORD.set(parameters, "");
AdbcDatabase adbcDatabase = driver.open(parameters);
@@ -407,14 +412,18 @@ $ java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED -
$ env _JAVA_OPTIONS="--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED" java -jar ...
```
-Otherwise, you may see errors like `module java.base does not "opens java.nio" to unnamed module` or `module java.base does not "opens java.nio" to org.apache.arrow.memory.core`
+Otherwise, you may see errors such as `module java.base does not "opens java.nio" to unnamed module`, `module java.base does not "opens java.nio" to org.apache.arrow.memory.core`, or `java.lang.NoClassDefFoundError: Could not initialize class org.apache.arrow.memory.util.MemoryUtil (Internal; Prepare)`.
+
+If you debug in IntelliJ IDEA, add `--add-opens=java.base/java.nio=ALL-UNNAMED` under `Build and run` in `Run/Debug Configurations`, as shown in the picture below:
+
+
The connection code example is as follows:
```Java
final Map<String, Object> parameters = new HashMap<>();
AdbcDriver.PARAM_URI.set(
-    parameters,"jdbc:arrow-flight-sql://0.0.0.0:9090?useServerPrepStmts=false&cachePrepStmts=true&useSSL=false&useEncryption=false");
+    parameters,"jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}?useServerPrepStmts=false&cachePrepStmts=true&useSSL=false&useEncryption=false");
AdbcDriver.PARAM_USERNAME.set(parameters, "root");
AdbcDriver.PARAM_PASSWORD.set(parameters, "");
try (
@@ -479,4 +488,12 @@ The Linux kernel version of kylinv10 SP2 and SP3 is only up to 4.19.90-24.4.v210
4. ADBC v0.10, JDBC and Java ADBC/JDBCDriver do not support parallel reading, and the `stmt.executePartitioned()` method is not implemented. You can only use the native FlightClient to implement parallel reading of multiple Endpoints, using the method `sqlClient=new FlightSqlClient, execute=sqlClient.execute(sql), endpoints=execute.getEndpoints(), for(FlightEndpoint endpoint: endpoints)`. In addition, the default AdbcStatement of ADBC V0.10 is actually JdbcStatement. After executeQue [...]
-5. As of Arrow v15.0, Arrow JDBC Connector does not support specifying the database name in the URL. For example, `jdbc:arrow-flight-sql://0.0.0.0:9090/test?useServerPrepStmts=false` specifies that the connection to the `test` database is invalid. You can only execute the SQL `use database` manually.
+5. As of Arrow v15.0, Arrow JDBC Connector does not support specifying the database name in the URL. For example, `jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}/test?useServerPrepStmts=false` specifies that the connection to the `test` database is invalid. You can only execute the SQL `use database` manually.
+
+6. Doris 2.1.4 has a bug that can cause errors when reading large amounts of data. It is fixed by the PR [Fix arrow flight result sink #36827](https://github.com/apache/doris/pull/36827); upgrading to Doris 2.1.5 resolves the problem. For details, see: [Questions](https://ask.selectdb.com/questions/D1Ia1/arrow-flight-sql-shi-yong-python-de-adbc-driver-lian-jie-doris-zhi-xing-cha-xun-sql-du-qu-bu-dao-shu-ju)
+
+7. When using Python, you can ignore `Warning: Cannot disable autocommit; conn will not be DB-API 2.0 compliant`. This is an issue in the Python ADBC client and does not affect queries.
+
+8. If Python reports the error `grpc: received message larger than max (20748753 vs. 16777216)`, refer to [Python: grpc: received message larger than max (20748753 vs. 16777216) #2078](https://github.com/apache/arrow-adbc/issues/2078) and add `adbc_driver_flightsql.DatabaseOptions.WITH_MAX_MSG_SIZE.value` to the database options.
+
+9. Before Doris 2.1.7, the error `Reach limit of connections` may be reported. This is because the number of Arrow Flight connections for a single user is not kept below `max_user_connections` in `UserProperty`, which is 100 by default. You can raise the maximum number of connections for the user Billie to 1000 with `SET PROPERTY FOR 'Billie' 'max_user_connections' = '1000';`, or add `arrow_flight_token_cache_size=50` in `fe.conf` to limit the overall number [...]
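For item 8 above, the following is a minimal sketch of how the `WITH_MAX_MSG_SIZE` option might be added to `db_kwargs`. The helper names, the 64 MiB default, and passing the size as a string of bytes are assumptions for illustration; check the linked ADBC issue for the values that suit your driver version. The ADBC packages are imported lazily so the size helper can be used on its own.

```python
def max_msg_size_option(mib: int) -> str:
    """Return a message-size limit in bytes as a string (hypothetical helper).

    ADBC database options are commonly passed as strings, so the byte count
    is stringified here; this is an assumption, not a documented requirement.
    """
    if mib <= 0:
        raise ValueError("mib must be positive")
    return str(mib * 1024 * 1024)


def connect_with_max_msg_size(fe_host, port, user, password, mib=64):
    """Open a Flight SQL connection with a raised gRPC message-size limit."""
    # Imported here so max_msg_size_option() works without ADBC installed.
    import adbc_driver_manager
    import adbc_driver_flightsql
    import adbc_driver_flightsql.dbapi as flight_sql

    return flight_sql.connect(
        uri=f"grpc://{fe_host}:{port}",
        db_kwargs={
            adbc_driver_manager.DatabaseOptions.USERNAME.value: user,
            adbc_driver_manager.DatabaseOptions.PASSWORD.value: password,
            # Option named in item 8; the value chosen is an example only.
            adbc_driver_flightsql.DatabaseOptions.WITH_MAX_MSG_SIZE.value:
                max_msg_size_option(mib),
        },
    )
```

The limit in the error message, 16777216 bytes, corresponds to `max_msg_size_option(16)`, so any larger value raises the ceiling above the failing 20748753-byte message.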
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/db-connect/arrow-flight-sql-connect.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/db-connect/arrow-flight-sql-connect.md
index 0615f7760c..febbb31207 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/db-connect/arrow-flight-sql-connect.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/db-connect/arrow-flight-sql-connect.md
@@ -59,6 +59,11 @@ pip install adbc_driver_flightsql
```Python
import adbc_driver_manager
import adbc_driver_flightsql.dbapi as flight_sql
+
+>>> print(adbc_driver_manager.__version__)
+1.1.0
+>>> print(adbc_driver_flightsql.__version__)
+1.1.0
```
### Connect to Doris
@@ -74,7 +79,7 @@ import adbc_driver_flightsql.dbapi as flight_sql
Assuming that the Arrow Flight SQL services of FE and BE in the Doris instance will run on ports 9090 and 9091 respectively, and the Doris username/password is "user"/"pass", the connection process is as follows:
```Python
-conn = flight_sql.connect(uri="grpc://127.0.0.1:9090", db_kwargs={
+conn = flight_sql.connect(uri="grpc://{FE_HOST}:{fe.conf:arrow_flight_sql_port}", db_kwargs={
adbc_driver_manager.DatabaseOptions.USERNAME.value: "user",
adbc_driver_manager.DatabaseOptions.PASSWORD.value: "pass",
})
@@ -223,7 +228,7 @@ import adbc_driver_flightsql.dbapi as flight_sql
# step 2, create a client that interacts with the Doris Arrow Flight SQL service.
# Modify arrow_flight_sql_port in fe/conf/fe.conf to an available port, such as 9090.
# Modify arrow_flight_sql_port in be/conf/be.conf to an available port, such as 9091.
-conn = flight_sql.connect(uri="grpc://127.0.0.1:9090", db_kwargs={
+conn = flight_sql.connect(uri="grpc://{FE_HOST}:{fe.conf:arrow_flight_sql_port}", db_kwargs={
adbc_driver_manager.DatabaseOptions.USERNAME.value: "root",
adbc_driver_manager.DatabaseOptions.PASSWORD.value: "",
})
@@ -301,7 +306,12 @@ $ java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED -
# Indirectly via environment variables
$ env _JAVA_OPTIONS="--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED" java -jar ...
```
-Otherwise, you may see errors such as `module java.base does not "opens java.nio" to unnamed module` or `module java.base does not "opens java.nio" to org.apache.arrow.memory.core`
+
+Otherwise, you may see errors such as `module java.base does not "opens java.nio" to unnamed module`, `module java.base does not "opens java.nio" to org.apache.arrow.memory.core`, or `java.lang.NoClassDefFoundError: Could not initialize class org.apache.arrow.memory.util.MemoryUtil (Internal; Prepare)`
+
+If you debug in IntelliJ IDEA, add `--add-opens=java.base/java.nio=ALL-UNNAMED` under `Build and run` in `Run/Debug Configurations`, as shown in the picture below:
+
+
The connection code example is as follows:
@@ -312,7 +322,7 @@ import java.sql.ResultSet;
import java.sql.Statement;
Class.forName("org.apache.arrow.driver.jdbc.ArrowFlightJdbcDriver");
-String DB_URL = "jdbc:arrow-flight-sql://0.0.0.0:9090?useServerPrepStmts=false"
+String DB_URL = "jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}?useServerPrepStmts=false"
+ "&cachePrepStmts=true&useSSL=false&useEncryption=false";
String USER = "root";
String PASS = "";
@@ -377,7 +387,7 @@ POM dependency:
final BufferAllocator allocator = new RootAllocator();
FlightSqlDriver driver = new FlightSqlDriver(allocator);
Map<String, Object> parameters = new HashMap<>();
-AdbcDriver.PARAM_URI.set(parameters, Location.forGrpcInsecure("0.0.0.0", 9090).getUri().toString());
+AdbcDriver.PARAM_URI.set(parameters, Location.forGrpcInsecure("{FE_HOST}", {fe.conf:arrow_flight_sql_port}).getUri().toString());
AdbcDriver.PARAM_USERNAME.set(parameters, "root");
AdbcDriver.PARAM_PASSWORD.set(parameters, "");
AdbcDatabase adbcDatabase = driver.open(parameters);
@@ -414,7 +424,7 @@ connection.close();
```Java
final Map<String, Object> parameters = new HashMap<>();
AdbcDriver.PARAM_URI.set(
-    parameters,"jdbc:arrow-flight-sql://0.0.0.0:9090?useServerPrepStmts=false&cachePrepStmts=true&useSSL=false&useEncryption=false");
+    parameters,"jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}?useServerPrepStmts=false&cachePrepStmts=true&useSSL=false&useEncryption=false");
AdbcDriver.PARAM_USERNAME.set(parameters, "root");
AdbcDriver.PARAM_PASSWORD.set(parameters, "");
try (
@@ -479,4 +489,12 @@ The Linux kernel version of kylinv10 SP2 and SP3 is only up to 4.19.90-24.4.v2101.ky10.
4. ADBC v0.10, JDBC, and Java ADBC/JDBCDriver do not yet support parallel reading, and the `stmt.executePartitioned()` method is not implemented. You can only use the native FlightClient to read multiple Endpoints in parallel, using `sqlClient=new FlightSqlClient, execute=sqlClient.execute(sql), endpoints=execute.getEndpoints(), for(FlightEndpoint endpoint: endpoints)`. In addition, the default AdbcStatement of ADBC V0.10 is actually a JdbcStatement; after executeQuery, the row-format JDBC ResultSet is converted back into the Arrow columnar format. Java ADBC is expected to be fully functional by ADBC 1.0.0 [GitHub Issue](https://github.com/apache/arrow-adbc/issues/1490).
-5. As of Arrow v15.0, the Arrow JDBC Connector does not support specifying the database name in the URL. For example, specifying the `test` database in `jdbc:arrow-flight-sql://0.0.0.0:9090/test?useServerPrepStmts=false` has no effect; you can only execute the SQL `use database` manually.
+5. As of Arrow v15.0, the Arrow JDBC Connector does not support specifying the database name in the URL. For example, specifying the `test` database in `jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}/test?useServerPrepStmts=false` has no effect; you can only execute the SQL `use database` manually.
+
+6. Doris 2.1.4 has a bug that can cause errors when reading large amounts of data. It is fixed in [Fix arrow flight result sink #36827](https://github.com/apache/doris/pull/36827); upgrading to Doris 2.1.5 resolves the problem. For details, see: [Questions](https://ask.selectdb.com/questions/D1Ia1/arrow-flight-sql-shi-yong-python-de-adbc-driver-lian-jie-doris-zhi-xing-cha-xun-sql-du-qu-bu-dao-shu-ju)
+
+7. When using Python, you can ignore `Warning: Cannot disable autocommit; conn will not be DB-API 2.0 compliant`. This is an issue in the Python ADBC client and does not affect queries.
+
+8. If Python reports the error `grpc: received message larger than max (20748753 vs. 16777216)`, refer to [Python: grpc: received message larger than max (20748753 vs. 16777216) #2078](https://github.com/apache/arrow-adbc/issues/2078) and add `adbc_driver_flightsql.DatabaseOptions.WITH_MAX_MSG_SIZE.value` to the database options.
+
+9. Before Doris 2.1.7, the error `Reach limit of connections` may be reported. This is because the number of Arrow Flight connections for a single user is not kept below `max_user_connections` in `UserProperty`, which is 100 by default. You can raise the maximum number of connections for the user Billie to 1000 with `SET PROPERTY FOR 'Billie' 'max_user_connections' = '1000';`, or add `arrow_flight_token_cache_size=50` in `fe.conf` to limit the overall number of Arrow Flight connections. Before Doris 2.1.7, Arrow Flight connections time out and disconnect after 3 days by default, and the only enforced limit is that the number of connections stays below `qe_max_connection/2`; beyond that, connections are evicted by LRU. `qe_max_connection` is the total number of connections for all FE users, 1024 by default. See the description of the `arrow_flight_token_cache_size` config for details. [...]
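diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/db-connect/arrow-flight-sql-connect.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/db-connect/arrow-flight-sql-connect.md
index 5949d35b0c..e1091e9519 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/db-connect/arrow-flight-sql-connect.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/db-connect/arrow-flight-sql-connect.md
@@ -36,10 +36,10 @@ Doris implements a high-speed data link based on the Arrow Flight SQL protocol, supporting multiple
Apache Arrow Flight SQL is a protocol developed by the Apache Arrow community for interacting with database systems. ADBC clients use it to interact, in the Arrow data format, with databases that implement the Arrow Flight SQL protocol, combining the speed of Arrow Flight with the ease of use of JDBC/ODBC.
-For the motivation, design and implementation, and performance test results of Doris's Arrow Flight SQL support, as well as more concepts about Arrow Flight and ADBC, see: [GitHub Issue](https://github.com/apache/doris/issues/25514). This document mainly describes how to use Doris Arrow Flight SQL and covers some common problems.
+For the motivation, design and implementation, and performance test results of Doris's Arrow Flight SQL support, as well as more concepts about Arrow Flight and ADBC, see [GitHub Issue](https://github.com/apache/doris/issues/25514). This document mainly describes how to use Doris Arrow Flight SQL and covers some common problems.
-To install Apache Arrow, you can go to the official documentation (
-[Apache Arrow](https://arrow.apache.org/install/)) to find a detailed installation tutorial.
+To install Apache Arrow, you can go to the official documentation (
+[Apache Arrow](https://arrow.apache.org/install/)) to find a detailed installation tutorial.
## Python Usage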
@@ -59,6 +59,11 @@ pip install adbc_driver_flightsql
```Python
import adbc_driver_manager
import adbc_driver_flightsql.dbapi as flight_sql
+
+>>> print(adbc_driver_manager.__version__)
+1.1.0
+>>> print(adbc_driver_flightsql.__version__)
+1.1.0
```
### Connect to Doris
@@ -74,7 +79,7 @@ import adbc_driver_flightsql.dbapi as flight_sql
Assuming that the Arrow Flight SQL services of FE and BE in the Doris instance will run on ports 9090 and 9091 respectively, and the Doris username/password is "user"/"pass", the connection process is as follows:
```Python
-conn = flight_sql.connect(uri="grpc://127.0.0.1:9090", db_kwargs={
+conn = flight_sql.connect(uri="grpc://{FE_HOST}:{fe.conf:arrow_flight_sql_port}", db_kwargs={
adbc_driver_manager.DatabaseOptions.USERNAME.value: "user",
adbc_driver_manager.DatabaseOptions.PASSWORD.value: "pass",
})
@@ -223,7 +228,7 @@ import adbc_driver_flightsql.dbapi as flight_sql
# step 2, create a client that interacts with the Doris Arrow Flight SQL service.
# Modify arrow_flight_sql_port in fe/conf/fe.conf to an available port, such as 9090.
# Modify arrow_flight_sql_port in be/conf/be.conf to an available port, such as 9091.
-conn = flight_sql.connect(uri="grpc://127.0.0.1:9090", db_kwargs={
+conn = flight_sql.connect(uri="grpc://{FE_HOST}:{fe.conf:arrow_flight_sql_port}", db_kwargs={
adbc_driver_manager.DatabaseOptions.USERNAME.value: "root",
adbc_driver_manager.DatabaseOptions.PASSWORD.value: "",
})
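@@ -275,9 +280,9 @@ execute("select k5, sum(k1), count(1), avg(k3) from arrow_flight_sql_test group
cursor.close()
```
-## Jdbc Connector with Arrow Flight SQL
+## JDBC Connector with Arrow Flight SQL
-The open-source JDBC driver for the Arrow Flight SQL protocol is compatible with the standard JDBC API, so most BI tools can use it to access Doris through JDBC, and it supports high-speed transfer of Apache Arrow data. Usage is similar to connecting to Doris through the MySQL-protocol JDBC driver: just replace the jdbc:mysql protocol in the connection URL with the jdbc:arrow-flight-sql protocol. Query results are still returned as JDBC ResultSet data structures.
+The open-source JDBC driver for the Arrow Flight SQL protocol is compatible with the standard JDBC API, so most BI tools can use it to access Doris through JDBC, and it supports high-speed transfer of Apache Arrow data. Usage is similar to connecting to Doris through the MySQL-protocol JDBC driver: just replace the jdbc:mysql protocol in the connection URL with the jdbc:arrow-flight-sql protocol. Query results are still returned as JDBC ResultSet data structures.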
POM dependency:
```Java
@@ -301,7 +306,12 @@ $ java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED -
# Indirectly via environment variables
$ env _JAVA_OPTIONS="--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED" java -jar ...
```
-Otherwise, you may see errors such as `module java.base does not "opens java.nio" to unnamed module` or `module java.base does not "opens java.nio" to org.apache.arrow.memory.core`
+
+Otherwise, you may see errors such as `module java.base does not "opens java.nio" to unnamed module`, `module java.base does not "opens java.nio" to org.apache.arrow.memory.core`, or `java.lang.NoClassDefFoundError: Could not initialize class org.apache.arrow.memory.util.MemoryUtil (Internal; Prepare)`
+
+If you debug in IntelliJ IDEA, add `--add-opens=java.base/java.nio=ALL-UNNAMED` under `Build and run` in `Run/Debug Configurations`, as shown in the picture below:
+
+
The connection code example is as follows:
@@ -312,7 +322,7 @@ import java.sql.ResultSet;
import java.sql.Statement;
Class.forName("org.apache.arrow.driver.jdbc.ArrowFlightJdbcDriver");
-String DB_URL = "jdbc:arrow-flight-sql://0.0.0.0:9090?useServerPrepStmts=false"
+String DB_URL = "jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}?useServerPrepStmts=false"
+ "&cachePrepStmts=true&useSSL=false&useEncryption=false";
String USER = "root";
String PASS = "";
@@ -377,7 +387,7 @@ POM dependency:
final BufferAllocator allocator = new RootAllocator();
FlightSqlDriver driver = new FlightSqlDriver(allocator);
Map<String, Object> parameters = new HashMap<>();
-AdbcDriver.PARAM_URI.set(parameters, Location.forGrpcInsecure("0.0.0.0", 9090).getUri().toString());
+AdbcDriver.PARAM_URI.set(parameters, Location.forGrpcInsecure("{FE_HOST}", {fe.conf:arrow_flight_sql_port}).getUri().toString());
AdbcDriver.PARAM_USERNAME.set(parameters, "root");
AdbcDriver.PARAM_PASSWORD.set(parameters, "");
AdbcDatabase adbcDatabase = driver.open(parameters);
@@ -414,7 +424,7 @@ connection.close();
```Java
final Map<String, Object> parameters = new HashMap<>();
AdbcDriver.PARAM_URI.set(
-    parameters,"jdbc:arrow-flight-sql://0.0.0.0:9090?useServerPrepStmts=false&cachePrepStmts=true&useSSL=false&useEncryption=false");
+    parameters,"jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}?useServerPrepStmts=false&cachePrepStmts=true&useSSL=false&useEncryption=false");
AdbcDriver.PARAM_USERNAME.set(parameters, "root");
AdbcDriver.PARAM_PASSWORD.set(parameters, "");
try (
@@ -479,4 +489,12 @@ The Linux kernel version of kylinv10 SP2 and SP3 is only up to 4.19.90-24.4.v2101.ky10.
4. ADBC v0.10, JDBC, and Java ADBC/JDBCDriver do not yet support parallel reading, and the `stmt.executePartitioned()` method is not implemented. You can only use the native FlightClient to read multiple Endpoints in parallel, using `sqlClient=new FlightSqlClient, execute=sqlClient.execute(sql), endpoints=execute.getEndpoints(), for(FlightEndpoint endpoint: endpoints)`. In addition, the default AdbcStatement of ADBC V0.10 is actually a JdbcStatement; after executeQuery, the row-format JDBC ResultSet is converted back into the Arrow columnar format. Java ADBC is expected to be fully functional by ADBC 1.0.0 [GitHub Issue](https://github.com/apache/arrow-adbc/issues/1490).
-5. As of Arrow v15.0, the Arrow JDBC Connector does not support specifying the database name in the URL. For example, specifying the `test` database in `jdbc:arrow-flight-sql://0.0.0.0:9090/test?useServerPrepStmts=false` has no effect; you can only execute the SQL `use database` manually.
+5. As of Arrow v15.0, the Arrow JDBC Connector does not support specifying the database name in the URL. For example, specifying the `test` database in `jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}/test?useServerPrepStmts=false` has no effect; you can only execute the SQL `use database` manually.
+
+6. Doris 2.1.4 has a bug that can cause errors when reading large amounts of data. It is fixed by the PR [Fix arrow flight result sink #36827](https://github.com/apache/doris/pull/36827); upgrading to Doris 2.1.5 resolves the problem. For details, see: [Questions](https://ask.selectdb.com/questions/D1Ia1/arrow-flight-sql-shi-yong-python-de-adbc-driver-lian-jie-doris-zhi-xing-cha-xun-sql-du-qu-bu-dao-shu-ju)
+
+7. When using Python, you can ignore `Warning: Cannot disable autocommit; conn will not be DB-API 2.0 compliant`. This is an issue in the Python ADBC client and does not affect queries.
+
+8. If Python reports the error `grpc: received message larger than max (20748753 vs. 16777216)`, refer to [Python: grpc: received message larger than max (20748753 vs. 16777216) #2078](https://github.com/apache/arrow-adbc/issues/2078) and add `adbc_driver_flightsql.DatabaseOptions.WITH_MAX_MSG_SIZE.value` to the database options.
+
+9. Before Doris 2.1.7, the error `Reach limit of connections` may be reported. This is because the number of Arrow Flight connections for a single user is not kept below `max_user_connections` in `UserProperty`, which is 100 by default. You can raise the maximum number of connections for the user Billie to 1000 with `SET PROPERTY FOR 'Billie' 'max_user_connections' = '1000';`, or add `arrow_flight_token_cache_size=50` in `fe.conf` to limit the overall number of Arrow Flight connections. Before Doris 2.1.7, Arrow Flight connections time out and disconnect after 3 days by default, and the only enforced limit is that the number of connections stays below `qe_max_connection/2`; beyond that, connections are evicted by LRU. `qe_max_connection` is the total number of connections for all FE users, 1024 by default. See the description of the `arrow_flight_token_cache_size` config for details. [...]
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/db-connect/arrow-flight-sql-connect.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/db-connect/arrow-flight-sql-connect.md
index 0615f7760c..e1091e9519 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/db-connect/arrow-flight-sql-connect.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/db-connect/arrow-flight-sql-connect.md
@@ -59,6 +59,11 @@ pip install adbc_driver_flightsql
```Python
import adbc_driver_manager
import adbc_driver_flightsql.dbapi as flight_sql
+
+>>> print(adbc_driver_manager.__version__)
+1.1.0
+>>> print(adbc_driver_flightsql.__version__)
+1.1.0
```
### Connect to Doris
@@ -74,7 +79,7 @@ import adbc_driver_flightsql.dbapi as flight_sql
Assuming that the Arrow Flight SQL services of FE and BE in the Doris instance will run on ports 9090 and 9091 respectively, and the Doris username/password is "user"/"pass", the connection process is as follows:
```Python
-conn = flight_sql.connect(uri="grpc://127.0.0.1:9090", db_kwargs={
+conn = flight_sql.connect(uri="grpc://{FE_HOST}:{fe.conf:arrow_flight_sql_port}", db_kwargs={
adbc_driver_manager.DatabaseOptions.USERNAME.value: "user",
adbc_driver_manager.DatabaseOptions.PASSWORD.value: "pass",
})
@@ -223,7 +228,7 @@ import adbc_driver_flightsql.dbapi as flight_sql
# step 2, create a client that interacts with the Doris Arrow Flight SQL service.
# Modify arrow_flight_sql_port in fe/conf/fe.conf to an available port, such as 9090.
# Modify arrow_flight_sql_port in be/conf/be.conf to an available port, such as 9091.
-conn = flight_sql.connect(uri="grpc://127.0.0.1:9090", db_kwargs={
+conn = flight_sql.connect(uri="grpc://{FE_HOST}:{fe.conf:arrow_flight_sql_port}", db_kwargs={
adbc_driver_manager.DatabaseOptions.USERNAME.value: "root",
adbc_driver_manager.DatabaseOptions.PASSWORD.value: "",
})
@@ -301,7 +306,12 @@ $ java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED -
# Indirectly via environment variables
$ env _JAVA_OPTIONS="--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED" java -jar ...
```
-Otherwise, you may see errors such as `module java.base does not "opens java.nio" to unnamed module` or `module java.base does not "opens java.nio" to org.apache.arrow.memory.core`
+
+Otherwise, you may see errors such as `module java.base does not "opens java.nio" to unnamed module`, `module java.base does not "opens java.nio" to org.apache.arrow.memory.core`, or `java.lang.NoClassDefFoundError: Could not initialize class org.apache.arrow.memory.util.MemoryUtil (Internal; Prepare)`
+
+If you debug in IntelliJ IDEA, add `--add-opens=java.base/java.nio=ALL-UNNAMED` under `Build and run` in `Run/Debug Configurations`, as shown in the picture below:
+
+
The connection code example is as follows:
@@ -312,7 +322,7 @@ import java.sql.ResultSet;
import java.sql.Statement;
Class.forName("org.apache.arrow.driver.jdbc.ArrowFlightJdbcDriver");
-String DB_URL = "jdbc:arrow-flight-sql://0.0.0.0:9090?useServerPrepStmts=false"
+String DB_URL = "jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}?useServerPrepStmts=false"
+ "&cachePrepStmts=true&useSSL=false&useEncryption=false";
String USER = "root";
String PASS = "";
@@ -377,7 +387,7 @@ POM dependency:
final BufferAllocator allocator = new RootAllocator();
FlightSqlDriver driver = new FlightSqlDriver(allocator);
Map<String, Object> parameters = new HashMap<>();
-AdbcDriver.PARAM_URI.set(parameters, Location.forGrpcInsecure("0.0.0.0", 9090).getUri().toString());
+AdbcDriver.PARAM_URI.set(parameters, Location.forGrpcInsecure("{FE_HOST}", {fe.conf:arrow_flight_sql_port}).getUri().toString());
AdbcDriver.PARAM_USERNAME.set(parameters, "root");
AdbcDriver.PARAM_PASSWORD.set(parameters, "");
AdbcDatabase adbcDatabase = driver.open(parameters);
@@ -414,7 +424,7 @@ connection.close();
```Java
final Map<String, Object> parameters = new HashMap<>();
AdbcDriver.PARAM_URI.set(
-    parameters,"jdbc:arrow-flight-sql://0.0.0.0:9090?useServerPrepStmts=false&cachePrepStmts=true&useSSL=false&useEncryption=false");
+    parameters,"jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}?useServerPrepStmts=false&cachePrepStmts=true&useSSL=false&useEncryption=false");
AdbcDriver.PARAM_USERNAME.set(parameters, "root");
AdbcDriver.PARAM_PASSWORD.set(parameters, "");
try (
@@ -479,4 +489,12 @@ The Linux kernel version of kylinv10 SP2 and SP3 is only up to 4.19.90-24.4.v2101.ky10.
4. ADBC v0.10, JDBC, and Java ADBC/JDBCDriver do not yet support parallel reading, and the `stmt.executePartitioned()` method is not implemented. You can only use the native FlightClient to read multiple Endpoints in parallel, using `sqlClient=new FlightSqlClient, execute=sqlClient.execute(sql), endpoints=execute.getEndpoints(), for(FlightEndpoint endpoint: endpoints)`. In addition, the default AdbcStatement of ADBC V0.10 is actually a JdbcStatement; after executeQuery, the row-format JDBC ResultSet is converted back into the Arrow columnar format. Java ADBC is expected to be fully functional by ADBC 1.0.0 [GitHub Issue](https://github.com/apache/arrow-adbc/issues/1490).
-5. As of Arrow v15.0, the Arrow JDBC Connector does not support specifying the database name in the URL. For example, specifying the `test` database in `jdbc:arrow-flight-sql://0.0.0.0:9090/test?useServerPrepStmts=false` has no effect; you can only execute the SQL `use database` manually.
+5. As of Arrow v15.0, the Arrow JDBC Connector does not support specifying the database name in the URL. For example, specifying the `test` database in `jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}/test?useServerPrepStmts=false` has no effect; you can only execute the SQL `use database` manually.
+
+6. Doris 2.1.4 has a bug that can cause errors when reading large amounts of data. It is fixed by the PR [Fix arrow flight result sink #36827](https://github.com/apache/doris/pull/36827); upgrading to Doris 2.1.5 resolves the problem. For details, see: [Questions](https://ask.selectdb.com/questions/D1Ia1/arrow-flight-sql-shi-yong-python-de-adbc-driver-lian-jie-doris-zhi-xing-cha-xun-sql-du-qu-bu-dao-shu-ju)
+
+7. When using Python, you can ignore `Warning: Cannot disable autocommit; conn will not be DB-API 2.0 compliant`. This is an issue in the Python ADBC client and does not affect queries.
+
+8. If Python reports the error `grpc: received message larger than max (20748753 vs. 16777216)`, refer to [Python: grpc: received message larger than max (20748753 vs. 16777216) #2078](https://github.com/apache/arrow-adbc/issues/2078) and add `adbc_driver_flightsql.DatabaseOptions.WITH_MAX_MSG_SIZE.value` to the database options.
+
+9. Before Doris 2.1.7, the error `Reach limit of connections` may be reported. This is because the number of Arrow Flight connections for a single user is not kept below `max_user_connections` in `UserProperty`, which is 100 by default. You can raise the maximum number of connections for the user Billie to 1000 with `SET PROPERTY FOR 'Billie' 'max_user_connections' = '1000';`, or add `arrow_flight_token_cache_size=50` in `fe.conf` to limit the overall number of Arrow Flight connections. Before Doris 2.1.7, Arrow Flight connections time out and disconnect after 3 days by default, and the only enforced limit is that the number of connections stays below `qe_max_connection/2`; beyond that, connections are evicted by LRU. `qe_max_connection` is the total number of connections for all FE users, 1024 by default. See the description of the `arrow_flight_token_cache_size` config for details. [...]
diff --git a/versioned_docs/version-2.1/db-connect/arrow-flight-sql-connect.md
b/versioned_docs/version-2.1/db-connect/arrow-flight-sql-connect.md
index 8f226fc9db..0c9e326f5f 100644
--- a/versioned_docs/version-2.1/db-connect/arrow-flight-sql-connect.md
+++ b/versioned_docs/version-2.1/db-connect/arrow-flight-sql-connect.md
@@ -58,6 +58,11 @@ Import the following modules/libraries in the code to use
the installed Library:
```Python
import adbc_driver_manager
import adbc_driver_flightsql.dbapi as flight_sql
+
+>>> print(adbc_driver_manager.__version__)
+1.1.0
+>>> print(adbc_driver_flightsql.__version__)
+1.1.0
```
### Connect to Doris
@@ -73,7 +78,7 @@ Modify the configuration parameters of Doris FE and BE:
Assuming that the Arrow Flight SQL services of FE and BE in the Doris instance
will run on ports 9090 and 9091 respectively, and the Doris username/password
is "user"/"pass", the connection process is as follows:
```Python
-conn = flight_sql.connect(uri="grpc://127.0.0.1:9090", db_kwargs={
+conn =
flight_sql.connect(uri="grpc://{FE_HOST}:{fe.conf:arrow_flight_sql_port}",
db_kwargs={
adbc_driver_manager.DatabaseOptions.USERNAME.value: "user",
adbc_driver_manager.DatabaseOptions.PASSWORD.value: "pass",
})
@@ -222,7 +227,7 @@ import adbc_driver_flightsql.dbapi as flight_sql
# step 2, create a client that interacts with the Doris Arrow Flight SQL
service.
# Modify arrow_flight_sql_port in fe/conf/fe.conf to an available port, such
as 9090.
# Modify arrow_flight_sql_port in be/conf/be.conf to an available port, such
as 9091.
-conn = flight_sql.connect(uri="grpc://127.0.0.1:9090", db_kwargs={
+conn =
flight_sql.connect(uri="grpc://{FE_HOST}:{fe.conf:arrow_flight_sql_port}",
db_kwargs={
adbc_driver_manager.DatabaseOptions.USERNAME.value: "root",
adbc_driver_manager.DatabaseOptions.PASSWORD.value: "",
})
@@ -301,7 +306,7 @@ import java.sql.ResultSet;
import java.sql.Statement;
Class.forName("org.apache.arrow.driver.jdbc.ArrowFlightJdbcDriver");
-String DB_URL = "jdbc:arrow-flight-sql://0.0.0.0:9090?useServerPrepStmts=false"
+String DB_URL =
"jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}?useServerPrepStmts=false"
+ "&cachePrepStmts=true&useSSL=false&useEncryption=false";
String USER = "root";
String PASS = "";
@@ -349,6 +354,11 @@ POM dependency:
<artifactId>adbc-sql</artifactId>
<version>${adbc.version}</version>
</dependency>
+ <dependency>
+ <groupId>org.apache.arrow.adbc</groupId>
+ <artifactId>adbc-driver-flight-sql</artifactId>
+ <version>${adbc.version}</version>
+ </dependency>
</dependencies>
```
@@ -361,7 +371,7 @@ The connection code example is as follows:
final BufferAllocator allocator = new RootAllocator();
FlightSqlDriver driver = new FlightSqlDriver(allocator);
Map<String, Object> parameters = new HashMap<>();
-AdbcDriver.PARAM_URI.set(parameters, Location.forGrpcInsecure("0.0.0.0",
9090).getUri().toString());
+AdbcDriver.PARAM_URI.set(parameters, Location.forGrpcInsecure("{FE_HOST}",
{fe.conf:arrow_flight_sql_port}).getUri().toString());
AdbcDriver.PARAM_USERNAME.set(parameters, "root");
AdbcDriver.PARAM_PASSWORD.set(parameters, "");
AdbcDatabase adbcDatabase = driver.open(parameters);
@@ -402,14 +412,18 @@ $ java
--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED -
$ env
_JAVA_OPTIONS="--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED"
java -jar ...
```
-Otherwise, you may see errors like `module java.base does not "opens java.nio"
to unnamed module` or `module java.base does not "opens java.nio" to
org.apache.arrow.memory.core`
+Otherwise, you may see errors such as `module java.base does not "opens
java.nio" to unnamed module`, `module java.base does not "opens java.nio" to
org.apache.arrow.memory.core`, or `java.lang.NoClassDefFoundError: Could not
initialize class org.apache.arrow.memory.util.MemoryUtil (Internal; Prepare)`
+
+If you debug in IntelliJ IDEA, add
`--add-opens=java.base/java.nio=ALL-UNNAMED` under `Build and run` in
`Run/Debug Configurations`, as shown in the picture below:
+
+
The connection code example is as follows:
```Java
final Map<String, Object> parameters = new HashMap<>();
AdbcDriver.PARAM_URI.set(
-
parameters,"jdbc:arrow-flight-sql://0.0.0.0:9090?useServerPrepStmts=false&cachePrepStmts=true&useSSL=false&useEncryption=false");
+
parameters,"jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}?useServerPrepStmts=false&cachePrepStmts=true&useSSL=false&useEncryption=false");
AdbcDriver.PARAM_USERNAME.set(parameters, "root");
AdbcDriver.PARAM_PASSWORD.set(parameters, "");
try (
@@ -438,7 +452,7 @@ try (
}
```
-### Choice of JDBC and Java connection methods
+### Choice of Jdbc and Java connection methods
Compared with the traditional `jdbc:mysql` connection method, the performance
test of the Arrow Flight SQL connection method of Jdbc and Java can be found at
[GitHub Issue](https://github.com/apache/doris/issues/25514). Here are some
usage suggestions based on the test conclusions.
@@ -474,4 +488,12 @@ The Linux kernel version of kylinv10 SP2 and SP3 is only
up to 4.19.90-24.4.v210
4. ADBC v0.10, JDBC and Java ADBC/JDBCDriver do not support parallel
reading, and the `stmt.executePartitioned()` method is not implemented. You can
only use the native FlightClient to implement parallel reading of multiple
Endpoints, using the method `sqlClient=new FlightSqlClient,
execute=sqlClient.execute(sql), endpoints=execute.getEndpoints(),
for(FlightEndpoint endpoint: endpoints)`. In addition, the default
AdbcStatement of ADBC V0.10 is actually JdbcStatement. After executeQue [...]
-5. As of Arrow v15.0, the Arrow JDBC Connector does not support specifying the
database name in the URL. For example, specifying the `test` database in
`jdbc:arrow-flight-sql://0.0.0.0:9090/test?useServerPrepStmts=false` has no
effect; you have to execute the SQL `use database` manually.
+5. As of Arrow v15.0, the Arrow JDBC Connector does not support specifying the
database name in the URL. For example, specifying the `test` database in
`jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}/test?useServerPrepStmts=false`
has no effect; you have to execute the SQL `use database` manually.
+
+6. Doris 2.1.4 has a bug that may cause errors when reading large amounts of
data. It is fixed in PR [Fix arrow flight result sink
#36827](https://github.com/apache/doris/pull/36827). Upgrading to Doris 2.1.5
resolves the problem. For details, see:
[Questions](https://ask.selectdb.com/questions/D1Ia1/arrow-flight-sql-shi-yong-python-de-adbc-driver-lian-jie-doris-zhi-xing-cha-xun-sql-du-qu-bu-dao-shu-ju)
+
+7. When using Python, ignore the warning `Warning: Cannot disable autocommit;
conn will not be DB-API 2.0 compliant`. It is a problem with the Python ADBC
Client and does not affect queries.
+
+8. If Python reports the error `grpc: received message larger than max
(20748753 vs. 16777216)`, refer to [Python: grpc: received message larger than
max (20748753 vs. 16777216)
#2078](https://github.com/apache/arrow-adbc/issues/2078) and add
`adbc_driver_flightsql.DatabaseOptions.WITH_MAX_MSG_SIZE.value` to the Database
Options.
+
+9. Before Doris version 2.1.7, the error `Reach limit of connections` may be
reported. This is because the number of Arrow Flight connections for a single
user was not limited to less than `max_user_connections` in `UserProperty`,
which is 100 by default. You can raise the current maximum number of
connections for the user Billie to 1000 with `SET PROPERTY FOR 'Billie'
'max_user_connections' = '1000';`, or add `arrow_flight_token_cache_size=50` in
`fe.conf` to limit the overall number [...]
diff --git a/versioned_docs/version-3.0/db-connect/arrow-flight-sql-connect.md
b/versioned_docs/version-3.0/db-connect/arrow-flight-sql-connect.md
index 8ca606ab65..0c9e326f5f 100644
--- a/versioned_docs/version-3.0/db-connect/arrow-flight-sql-connect.md
+++ b/versioned_docs/version-3.0/db-connect/arrow-flight-sql-connect.md
@@ -58,6 +58,11 @@ Import the following modules/libraries in the code to use
the installed Library:
```Python
import adbc_driver_manager
import adbc_driver_flightsql.dbapi as flight_sql
+
+>>> print(adbc_driver_manager.__version__)
+1.1.0
+>>> print(adbc_driver_flightsql.__version__)
+1.1.0
```
### Connect to Doris
@@ -73,7 +78,7 @@ Modify the configuration parameters of Doris FE and BE:
Assuming that the Arrow Flight SQL services of FE and BE in the Doris instance
will run on ports 9090 and 9091 respectively, and the Doris username/password
is "user"/"pass", the connection process is as follows:
```Python
-conn = flight_sql.connect(uri="grpc://127.0.0.1:9090", db_kwargs={
+conn =
flight_sql.connect(uri="grpc://{FE_HOST}:{fe.conf:arrow_flight_sql_port}",
db_kwargs={
adbc_driver_manager.DatabaseOptions.USERNAME.value: "user",
adbc_driver_manager.DatabaseOptions.PASSWORD.value: "pass",
})
@@ -222,7 +227,7 @@ import adbc_driver_flightsql.dbapi as flight_sql
# step 2, create a client that interacts with the Doris Arrow Flight SQL
service.
# Modify arrow_flight_sql_port in fe/conf/fe.conf to an available port, such
as 9090.
# Modify arrow_flight_sql_port in be/conf/be.conf to an available port, such
as 9091.
-conn = flight_sql.connect(uri="grpc://127.0.0.1:9090", db_kwargs={
+conn =
flight_sql.connect(uri="grpc://{FE_HOST}:{fe.conf:arrow_flight_sql_port}",
db_kwargs={
adbc_driver_manager.DatabaseOptions.USERNAME.value: "root",
adbc_driver_manager.DatabaseOptions.PASSWORD.value: "",
})
@@ -301,7 +306,7 @@ import java.sql.ResultSet;
import java.sql.Statement;
Class.forName("org.apache.arrow.driver.jdbc.ArrowFlightJdbcDriver");
-String DB_URL = "jdbc:arrow-flight-sql://0.0.0.0:9090?useServerPrepStmts=false"
+String DB_URL =
"jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}?useServerPrepStmts=false"
+ "&cachePrepStmts=true&useSSL=false&useEncryption=false";
String USER = "root";
String PASS = "";
@@ -366,7 +371,7 @@ The connection code example is as follows:
final BufferAllocator allocator = new RootAllocator();
FlightSqlDriver driver = new FlightSqlDriver(allocator);
Map<String, Object> parameters = new HashMap<>();
-AdbcDriver.PARAM_URI.set(parameters, Location.forGrpcInsecure("0.0.0.0",
9090).getUri().toString());
+AdbcDriver.PARAM_URI.set(parameters, Location.forGrpcInsecure("{FE_HOST}",
{fe.conf:arrow_flight_sql_port}).getUri().toString());
AdbcDriver.PARAM_USERNAME.set(parameters, "root");
AdbcDriver.PARAM_PASSWORD.set(parameters, "");
AdbcDatabase adbcDatabase = driver.open(parameters);
@@ -407,14 +412,18 @@ $ java
--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED -
$ env
_JAVA_OPTIONS="--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED"
java -jar ...
```
-Otherwise, you may see errors like `module java.base does not "opens java.nio"
to unnamed module` or `module java.base does not "opens java.nio" to
org.apache.arrow.memory.core`
+Otherwise, you may see errors such as `module java.base does not "opens
java.nio" to unnamed module`, `module java.base does not "opens java.nio" to
org.apache.arrow.memory.core`, or `java.lang.NoClassDefFoundError: Could not
initialize class org.apache.arrow.memory.util.MemoryUtil (Internal; Prepare)`
+
+If you debug in IntelliJ IDEA, add
`--add-opens=java.base/java.nio=ALL-UNNAMED` under `Build and run` in
`Run/Debug Configurations`, as shown in the picture below:
+
+
The connection code example is as follows:
```Java
final Map<String, Object> parameters = new HashMap<>();
AdbcDriver.PARAM_URI.set(
-
parameters,"jdbc:arrow-flight-sql://0.0.0.0:9090?useServerPrepStmts=false&cachePrepStmts=true&useSSL=false&useEncryption=false");
+
parameters,"jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}?useServerPrepStmts=false&cachePrepStmts=true&useSSL=false&useEncryption=false");
AdbcDriver.PARAM_USERNAME.set(parameters, "root");
AdbcDriver.PARAM_PASSWORD.set(parameters, "");
try (
@@ -479,4 +488,12 @@ The Linux kernel version of kylinv10 SP2 and SP3 is only
up to 4.19.90-24.4.v210
4. ADBC v0.10, JDBC and Java ADBC/JDBCDriver do not support parallel
reading, and the `stmt.executePartitioned()` method is not implemented. You can
only use the native FlightClient to implement parallel reading of multiple
Endpoints, using the method `sqlClient=new FlightSqlClient,
execute=sqlClient.execute(sql), endpoints=execute.getEndpoints(),
for(FlightEndpoint endpoint: endpoints)`. In addition, the default
AdbcStatement of ADBC V0.10 is actually JdbcStatement. After executeQue [...]
-5. As of Arrow v15.0, the Arrow JDBC Connector does not support specifying the
database name in the URL. For example, specifying the `test` database in
`jdbc:arrow-flight-sql://0.0.0.0:9090/test?useServerPrepStmts=false` has no
effect; you have to execute the SQL `use database` manually.
+5. As of Arrow v15.0, the Arrow JDBC Connector does not support specifying the
database name in the URL. For example, specifying the `test` database in
`jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}/test?useServerPrepStmts=false`
has no effect; you have to execute the SQL `use database` manually.
+
+6. Doris 2.1.4 has a bug that may cause errors when reading large amounts of
data. It is fixed in PR [Fix arrow flight result sink
#36827](https://github.com/apache/doris/pull/36827). Upgrading to Doris 2.1.5
resolves the problem. For details, see:
[Questions](https://ask.selectdb.com/questions/D1Ia1/arrow-flight-sql-shi-yong-python-de-adbc-driver-lian-jie-doris-zhi-xing-cha-xun-sql-du-qu-bu-dao-shu-ju)
+
+7. When using Python, ignore the warning `Warning: Cannot disable autocommit;
conn will not be DB-API 2.0 compliant`. It is a problem with the Python ADBC
Client and does not affect queries.
+
+8. If Python reports the error `grpc: received message larger than max
(20748753 vs. 16777216)`, refer to [Python: grpc: received message larger than
max (20748753 vs. 16777216)
#2078](https://github.com/apache/arrow-adbc/issues/2078) and add
`adbc_driver_flightsql.DatabaseOptions.WITH_MAX_MSG_SIZE.value` to the Database
Options.
+
+9. Before Doris version 2.1.7, the error `Reach limit of connections` may be
reported. This is because the number of Arrow Flight connections for a single
user was not limited to less than `max_user_connections` in `UserProperty`,
which is 100 by default. You can raise the current maximum number of
connections for the user Billie to 1000 with `SET PROPERTY FOR 'Billie'
'max_user_connections' = '1000';`, or add `arrow_flight_token_cache_size=50` in
`fe.conf` to limit the overall number [...]
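
The connection snippets in this diff now use the placeholders `{FE_HOST}` and `{fe.conf:arrow_flight_sql_port}` instead of a literal address, and note 8 mentions the `WITH_MAX_MSG_SIZE` database option. The following is a minimal Python sketch of how the two might be combined, not the documented method. The helper names `build_flight_uri` and `connect_doris` and the 256 MiB size in the usage note are illustrative assumptions, and passing the size as a string is also an assumption.

```python
from typing import Optional


def build_flight_uri(fe_host: str, arrow_flight_sql_port: int) -> str:
    """Build the gRPC URI for the FE Arrow Flight SQL service.

    Hypothetical helper: fe_host stands for {FE_HOST} and
    arrow_flight_sql_port for the arrow_flight_sql_port value in fe.conf.
    """
    if not fe_host or "{" in fe_host:
        raise ValueError("replace the {FE_HOST} placeholder with a real host")
    if not 0 < arrow_flight_sql_port < 65536:
        raise ValueError("arrow_flight_sql_port must be a valid TCP port")
    return f"grpc://{fe_host}:{arrow_flight_sql_port}"


def connect_doris(fe_host: str, arrow_flight_sql_port: int, user: str,
                  password: str, max_msg_size: Optional[int] = None):
    """Open an ADBC Flight SQL connection to Doris (hypothetical wrapper)."""
    # Imported lazily so build_flight_uri works without the ADBC packages.
    import adbc_driver_manager
    import adbc_driver_flightsql
    import adbc_driver_flightsql.dbapi as flight_sql

    db_kwargs = {
        adbc_driver_manager.DatabaseOptions.USERNAME.value: user,
        adbc_driver_manager.DatabaseOptions.PASSWORD.value: password,
    }
    if max_msg_size is not None:
        # Note 8: raise the gRPC message size limit.
        # Assumption: the option value is passed as a string.
        key = adbc_driver_flightsql.DatabaseOptions.WITH_MAX_MSG_SIZE.value
        db_kwargs[key] = str(max_msg_size)

    return flight_sql.connect(
        uri=build_flight_uri(fe_host, arrow_flight_sql_port),
        db_kwargs=db_kwargs,
    )
```

For example, `connect_doris("127.0.0.1", 9090, "root", "", max_msg_size=256 * 1024 * 1024)` would target an FE whose `arrow_flight_sql_port` is 9090; the size shown is only a placeholder to tune against the failing message size.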