jerryshiba opened a new issue, #38249:
URL: https://github.com/apache/doris/issues/38249

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no 
similar issues.
   
   
   ### Version
   
   doris-2.1.4-rc03
   hadoop 3.3.6
   minio RELEASE.2021-04-18T19-26-29Z
   hive standalone metastore 3.1.3
   CDH 6.3.2
   
   ### What's Wrong?
   
   Starting from version 2.1.3, Doris supports DDL and DML operations for Hive. Writes to HDFS work normally, but writes to MinIO S3 fail.
   
   ### What You Expected?
   
    Writes to HDFS work normally; writes to MinIO S3 should work as well.
   
   ### How to Reproduce?
   
   Prerequisites
   MinIO must already be installed with SSL disabled (version as listed above).
   
   A. Download and install Hadoop 3.3.6, but do not start it
   1. Download Hadoop 3.3.6
   
https://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-3.3.6/hadoop-3.3.6.tar.gz
   2. Extract to /hadoop/hadoop-3.3.6
   
   
   B. Install and start the Hive standalone metastore
   1. Download Hive standalone metastore 3.1.3
   
https://repo1.maven.org/maven2/org/apache/hive/hive-standalone-metastore/3.1.3/hive-standalone-metastore-3.1.3-bin.tar.gz
   2. Extract to /hadoop/hive-metastore-3.1.3
   3. Install the required AWS libraries
   ```
   ## Download https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-core/1.12.762
   # cp aws-java-sdk-core-1.12.762.jar /hadoop/hive-metastore-3.1.3/lib/
   # cp /hadoop/hadoop-3.3.6/share/hadoop/tools/lib/hadoop-aws-3.3.6.jar /hadoop/hive-metastore-3.1.3/lib/
   # cp /hadoop/hadoop-3.3.6/share/hadoop/tools/lib/aws-java-sdk-bundle-1.12.367.jar /hadoop/hive-metastore-3.1.3/lib/
   ```
   4. Edit metastore-site.xml
   ```
   # vi /hadoop/hive-metastore-3.1.3/conf/metastore-site.xml
   ...
     <property>
       <name>metastore.thrift.uris</name>
       <value>thrift://<NODE_IP>:9083</value>
       <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
     </property>
     <property>
       <name>hive.metastore.schema.verification</name>
       <value>false</value>
     </property>
     <property>
       <name>datanucleus.schema.autoCreateAll</name>
       <value>true</value>
     </property>
   ...
   ```
   5. Edit core-site.xml
   ```
   # vi /hadoop/hadoop-3.3.6/etc/hadoop/core-site.xml
   ...
     <property>
       <name>fs.defaultFS</name>
       <value>s3a://testbucket</value>
     </property>
     <property>
       <name>fs.s3a.access.key</name>
       <value>***</value>
     </property>
     <property>
       <name>fs.s3a.secret.key</name>
       <value>***</value>
     </property>
     <property>
       <name>fs.s3a.endpoint</name>
       <value>http://<NODE_IP>:9900</value>
     </property>
     <property>
       <name>fs.s3a.connection.ssl.enabled</name>
       <value>false</value>
     </property>
     <property>
       <name>fs.s3a.path.style.access</name>
       <value>true</value>
     </property>
     <property>
       <name>fs.s3a.checksum.validation</name>
       <value>false</value>
     </property>
     <property>
       <name>fs.s3a.endpoint.region</name>
       <value>us-east-1</value>
     </property>
     <property>
       <name>fs.s3a.fast.upload</name>
       <value>true</value>
     </property>
   ...
   ```
   6. Edit start-metastore
   ```
   # vi /hadoop/hive-metastore-3.1.3/bin/start-metastore
   export HADOOP_HOME=/hadoop/hadoop-3.3.6
   ```
   7. Start the Hive metastore
   ```
   # cd /hadoop/hive-metastore-3.1.3/bin
   # ./start-metastore
   ```
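   As a quick sanity check on the configuration above, the snippet below (a minimal sketch; the `load_properties` helper and the `NODE_IP` host are illustrative, not part of any Hadoop API) parses a Hadoop-style configuration fragment and confirms that the S3A keys MinIO needs are all present:

   ```python
   import xml.etree.ElementTree as ET

   # Hadoop-style configuration fragment mirroring the core-site.xml above
   # (credentials redacted, endpoint host is a placeholder).
   CORE_SITE = """
   <configuration>
     <property><name>fs.defaultFS</name><value>s3a://testbucket</value></property>
     <property><name>fs.s3a.access.key</name><value>***</value></property>
     <property><name>fs.s3a.secret.key</name><value>***</value></property>
     <property><name>fs.s3a.endpoint</name><value>http://NODE_IP:9900</value></property>
     <property><name>fs.s3a.connection.ssl.enabled</name><value>false</value></property>
     <property><name>fs.s3a.path.style.access</name><value>true</value></property>
     <property><name>fs.s3a.endpoint.region</name><value>us-east-1</value></property>
   </configuration>
   """

   def load_properties(xml_text):
       """Return a dict of <name> -> <value> from a Hadoop configuration XML."""
       root = ET.fromstring(xml_text)
       return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}

   # Keys that a non-SSL, path-style MinIO endpoint typically requires.
   REQUIRED = [
       "fs.s3a.access.key", "fs.s3a.secret.key", "fs.s3a.endpoint",
       "fs.s3a.path.style.access", "fs.s3a.endpoint.region",
   ]

   props = load_properties(CORE_SITE)
   missing = [k for k in REQUIRED if k not in props]
   print("missing:", missing)  # expect an empty list
   ```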
   
   C. Run Doris SQL
   1. CREATE CATALOG
   ```sql
   CREATE CATALOG hive PROPERTIES (
       "type"="hms",
       "hive.metastore.uris" = "thrift://<NODE_IP>:9083",
       "fs.defaultFS" = "s3a://testbucket",
       "s3.region" = "us-east-1",
       "s3.endpoint" = "http://<NODE_IP>:9900",
       "s3.access_key" = "***",
       "s3.secret_key" = "***",
       "use_path_style" = "true"
    );
   ```
   2. CREATE DATABASE
   ```sql
   CREATE DATABASE hive.jerry_hive_db;
   ```
   3. CREATE TABLE
   ```sql
   CREATE TABLE hive.jerry_hive_db.jerry_hive_table(
     id INT,
     name String,
     year INT,
     month INT
   ) ENGINE=hive
   PARTITION BY LIST (year,month) ()
   PROPERTIES ('file_format'='parquet','transactional'='false');
   ```
   4. Inserting data throws an error
   ```sql
   INSERT INTO hive.jerry_hive_db.jerry_hive_table VALUES(1,"Jerry",2024,2);
   ERROR 1105 (HY000): errCode = 2, detailMessage = Not support insert target table
   ```
   However, during CREATE DATABASE and CREATE TABLE the MinIO UI does show the corresponding paths being created (see screenshot below).
   That path matches the LOCATION reported by SHOW CREATE TABLE hive.jerry_hive_db.jerry_hive_table;
   LOCATION
     's3a://testbucket/user/hive/warehouse/jerry_hive_db.db/jerry_hive_table'
   
![image](https://github.com/user-attachments/assets/36843010-9199-4db7-9221-21bcdb942559)
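   For reference, the LOCATION above maps directly onto what the MinIO UI shows: with path-style access, the authority of the s3a URI is the bucket and the rest of the path is the object prefix. A minimal stdlib sketch of that mapping (the `split_s3a_uri` helper is made up for illustration):

   ```python
   from urllib.parse import urlparse

   def split_s3a_uri(uri):
       """Split an s3a:// URI into (bucket, object key prefix)."""
       parsed = urlparse(uri)
       assert parsed.scheme == "s3a"
       return parsed.netloc, parsed.path.lstrip("/")

   # LOCATION reported by SHOW CREATE TABLE in the report above.
   location = "s3a://testbucket/user/hive/warehouse/jerry_hive_db.db/jerry_hive_table"
   bucket, prefix = split_s3a_uri(location)
   print(bucket)  # testbucket
   print(prefix)  # user/hive/warehouse/jerry_hive_db.db/jerry_hive_table
   ```

   This matches the screenshot: the bucket `testbucket` contains the prefix created by CREATE DATABASE / CREATE TABLE, even though the subsequent INSERT fails.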
   
   
   ----------------------------------------------------------
   Additional note: with CDH 6.3.2 integrated via HDFS (no Kerberos), writes work normally.
   The differences are:
   1. Modify core-site.xml
   ```
   # vi /hadoop/hadoop-3.3.6/etc/hadoop/core-site.xml
   ...
     <property>
       <name>fs.defaultFS</name>
       <value>hdfs://<NAMENODE_IP>:8020</value>
     </property>
   ...
   ```
   2. Run Doris SQL
   ```sql
   CREATE CATALOG hive PROPERTIES (
       "type"="hms",
       "hive.metastore.uris" = "thrift://<NODE_IP>:9083",
       "hadoop.username" = "hadoop",
       "fs.defaultFS" = "hdfs://<NAMENODE_IP>:8020"
       );
   CREATE DATABASE hive.jerry_hive_db;
   CREATE TABLE hive.jerry_hive_db.jerry_hive_table(
     id INT,
     name String,
     year INT,
     month INT
   ) ENGINE=hive
   PARTITION BY LIST (year,month) ()
   PROPERTIES ('file_format'='parquet','transactional'='false');
   
   INSERT INTO hive.jerry_hive_db.jerry_hive_table VALUES(1,"Jerry",2024,2);
   ```
   
   After the data is successfully written to HDFS, you can observe that:
   a. The path /user/hive/warehouse/jerry_hive_db.db/jerry_hive_table is created on HDFS.
   b. Partition directories such as year=2024/month=2 appear under that path.
   c. Parquet files sit at the lowest level.
   d. The data can be SELECTed normally.
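   The layout in (a)-(c) follows Hive's standard `key=value` partition naming. A small stdlib sketch (the `hive_partition_path` helper is hypothetical, for illustration only) of the directory a successful INSERT of (2024, 2) should produce:

   ```python
   import posixpath

   def hive_partition_path(warehouse, db, table, **partitions):
       """Build the Hive-style partition directory for a table."""
       # Hive names each partition level as key=value, in partition-column order.
       parts = [f"{k}={v}" for k, v in partitions.items()]
       return posixpath.join(warehouse, f"{db}.db", table, *parts)

   path = hive_partition_path(
       "/user/hive/warehouse", "jerry_hive_db", "jerry_hive_table",
       year=2024, month=2,
   )
   print(path)
   # /user/hive/warehouse/jerry_hive_db.db/jerry_hive_table/year=2024/month=2
   ```

   This is the same path prefix the failing S3A case would need to write under `s3a://testbucket/`.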
   
   ### Anything Else?
   
   No.
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

