LiJie20190102 opened a new issue, #10123:
URL: https://github.com/apache/seatunnel/issues/10123

   I am reading data from StarRocks with the following configuration:
   
   `env {
     parallelism = 1
     job.mode = "BATCH"
   
     # You can set spark configuration here
     spark.app.name = "SeaTunnel"
     spark.executor.instances = 2
     spark.executor.cores = 1
     spark.executor.memory = "1g"
     spark.master = local
   }
   
   source {
     StarRocks {
       nodeUrls = ["10.151.15.12:32529"]
       username = testyy
       password = "Ds@81905202"
       database = "db01"
       table = "user"
       max_retries = 3
       schema {
         fields {
           name1 = STRING
           name2 = STRING
           name3 = STRING
         }
       }
     }
   }
   
   transform {
   }
   
   sink {
     # choose stdout output plugin to output data to console
     Console {
       parallelism = 1
     }
   }`
   
   Running the job then fails with the following error:
   `Caused by: org.apache.seatunnel.connectors.seatunnel.starrocks.exception.StarRocksConnectorException: ErrorCode:[STARROCKS-04], ErrorDescription:[Create StarRocks BE reader failed] - Failed to open socket
        at org.apache.seatunnel.connectors.seatunnel.starrocks.client.source.StarRocksBeReadClient.<init>(StarRocksBeReadClient.java:88)
        at org.apache.seatunnel.connectors.seatunnel.starrocks.source.StarRocksSourceReader.read(StarRocksSourceReader.java:113)
        at org.apache.seatunnel.connectors.seatunnel.starrocks.source.StarRocksSourceReader.pollNext(StarRocksSourceReader.java:75)
        at org.apache.seatunnel.translation.source.ParallelSource.run(ParallelSource.java:144)
        at org.apache.seatunnel.translation.spark.source.partition.batch.ParallelBatchPartitionReader.lambda$prepare$0(ParallelBatchPartitionReader.java:117)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
        at java.util.concurrent.FutureTask.run(FutureTask.java)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        ... 3 more
   Caused by: com.starrocks.shade.org.apache.thrift.transport.TTransportException: java.net.UnknownHostException: pingt-7f5cf4cfdc-cn-0.pingt-7f5cf4cfdc-cn-headless.olap-c9cd-11f0-aec5-c245473c61c7-61f3bdda.svc.cluster.local
        at com.starrocks.shade.org.apache.thrift.transport.TSocket.open(TSocket.java:226)
        at org.apache.seatunnel.connectors.seatunnel.starrocks.client.source.StarRocksBeReadClient.<init>(StarRocksBeReadClient.java:85)
        ... 12 more
   Caused by: java.net.UnknownHostException: pingt-7f5cf4cfdc-cn-0.pingt-7f5cf4cfdc-cn-headless.olap-c9cd-11f0-aec5-c245473c61c7-61f3bdda.svc.cluster.local
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at com.starrocks.shade.org.apache.thrift.transport.TSocket.open(TSocket.java:221)
        ... 13 more`
   
   Looking at the code, the connector asks the FE for the BE host and port 
(9060), and the FE returns the BE's domain name. However, our StarRocks is 
deployed on Kubernetes, in a different cluster from our Spark cluster, so 
that domain name cannot be resolved from the Spark side and the exception 
occurs:
   
   <img width="1379" height="344" alt="Image" src="https://github.com/user-attachments/assets/bb1e1874-aff5-4c3a-940f-dcb9e029f988" />
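
The unresolved hostname in the trace ends in `.svc.cluster.local`, which is the default in-cluster DNS suffix for Kubernetes Services, so it is normally resolvable only from inside that cluster. As a rough illustration, a client could check for this suffix before connecting and report a clearer error. This is a minimal sketch: the class and method names are hypothetical and not part of SeaTunnel, and it assumes the default cluster domain `cluster.local`.

```java
import java.util.Locale;

/** Hypothetical helper, not part of SeaTunnel. */
final class ClusterLocalName {

    // Assumption: the cluster uses the default Kubernetes cluster domain "cluster.local".
    private static final String DEFAULT_SERVICE_SUFFIX = ".svc.cluster.local";

    private ClusterLocalName() {
    }

    /** Returns true if the host looks like a Kubernetes in-cluster service name. */
    static boolean isClusterLocal(String host) {
        return host != null
                && host.toLowerCase(Locale.ROOT).endsWith(DEFAULT_SERVICE_SUFFIX);
    }

    public static void main(String[] args) {
        String beHost = "pingt-7f5cf4cfdc-cn-0.pingt-7f5cf4cfdc-cn-headless."
                + "olap-c9cd-11f0-aec5-c245473c61c7-61f3bdda.svc.cluster.local";
        if (isClusterLocal(beHost)) {
            System.out.println("BE host is an in-cluster name and may not resolve from outside: " + beHost);
        }
    }
}
```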
   
   My proposed solution: allow the BE host and port (9060) to be set in the 
connector configuration, and fall back to the addresses returned by the FE 
when the user has not configured them. What do you think?
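
To make the proposal concrete, here is a minimal sketch of the fallback logic, assuming the user supplies a map from FE-reported BE addresses to addresses reachable from the Spark side. `BeAddressResolver` and the override map are hypothetical and do not exist in the connector today, and the addresses in `main` are made-up placeholders.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical helper, not part of SeaTunnel: prefers configured BE addresses over FE-reported ones. */
final class BeAddressResolver {

    // Key: "host:port" as reported by the FE; value: "host:port" reachable from the client.
    private final Map<String, String> overrides;

    BeAddressResolver(Map<String, String> overrides) {
        this.overrides = new HashMap<>(Objects.requireNonNull(overrides, "overrides"));
    }

    /** Returns the configured address if one exists, otherwise the FE-reported address unchanged. */
    String resolve(String feReportedAddress) {
        return overrides.getOrDefault(feReportedAddress, feReportedAddress);
    }

    public static void main(String[] args) {
        Map<String, String> overrides = new HashMap<>();
        // Placeholder values for illustration only.
        overrides.put("be-0.example.svc.cluster.local:9060", "10.0.0.5:30960");

        BeAddressResolver resolver = new BeAddressResolver(overrides);
        // Configured: the override is used.
        System.out.println(resolver.resolve("be-0.example.svc.cluster.local:9060"));
        // Not configured: falls back to what the FE returned.
        System.out.println(resolver.resolve("192.168.1.10:9060"));
    }
}
```

With an empty map the resolver always returns the FE-reported address, so existing jobs that do not set the new option would behave as they do now.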

