LiJie20190102 opened a new issue, #10123:
URL: https://github.com/apache/seatunnel/issues/10123
I am reading data from StarRocks with the following configuration:
```
env {
  parallelism = 1
  job.mode = "BATCH"
  # You can set spark configuration here
  spark.app.name = "SeaTunnel"
  spark.executor.instances = 2
  spark.executor.cores = 1
  spark.executor.memory = "1g"
  spark.master = local
}
source {
  StarRocks {
    nodeUrls = ["10.151.15.12:32529"]
    username = testyy
    password = "Ds@81905202"
    database = "db01"
    table = "user"
    max_retries = 3
    schema {
      fields {
        name1 = STRING
        name2 = STRING
        name3 = STRING
      }
    }
  }
}
transform {
}
sink {
  # choose stdout output plugin to output data to console
  Console {
    parallelism = 1
  }
}
```
The job fails with the following error:
```
Caused by: org.apache.seatunnel.connectors.seatunnel.starrocks.exception.StarRocksConnectorException: ErrorCode:[STARROCKS-04], ErrorDescription:[Create StarRocks BE reader failed] - Failed to open socket
    at org.apache.seatunnel.connectors.seatunnel.starrocks.client.source.StarRocksBeReadClient.<init>(StarRocksBeReadClient.java:88)
    at org.apache.seatunnel.connectors.seatunnel.starrocks.source.StarRocksSourceReader.read(StarRocksSourceReader.java:113)
    at org.apache.seatunnel.connectors.seatunnel.starrocks.source.StarRocksSourceReader.pollNext(StarRocksSourceReader.java:75)
    at org.apache.seatunnel.translation.source.ParallelSource.run(ParallelSource.java:144)
    at org.apache.seatunnel.translation.spark.source.partition.batch.ParallelBatchPartitionReader.lambda$prepare$0(ParallelBatchPartitionReader.java:117)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
    at java.util.concurrent.FutureTask.run(FutureTask.java)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    ... 3 more
Caused by: com.starrocks.shade.org.apache.thrift.transport.TTransportException: java.net.UnknownHostException: pingt-7f5cf4cfdc-cn-0.pingt-7f5cf4cfdc-cn-headless.olap-c9cd-11f0-aec5-c245473c61c7-61f3bdda.svc.cluster.local
    at com.starrocks.shade.org.apache.thrift.transport.TSocket.open(TSocket.java:226)
    at org.apache.seatunnel.connectors.seatunnel.starrocks.client.source.StarRocksBeReadClient.<init>(StarRocksBeReadClient.java:85)
    ... 12 more
Caused by: java.net.UnknownHostException: pingt-7f5cf4cfdc-cn-0.pingt-7f5cf4cfdc-cn-headless.olap-c9cd-11f0-aec5-c245473c61c7-61f3bdda.svc.cluster.local
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at com.starrocks.shade.org.apache.thrift.transport.TSocket.open(TSocket.java:221)
    ... 13 more
```
Looking at the code, the connector asks the FE for the host and port (9060)
of each BE, and the FE returns the BE's domain name. However, our StarRocks
cluster is deployed on Kubernetes, in a different cluster from our Spark
cluster, so the Spark side cannot resolve that Kubernetes-internal domain
name, which causes the `UnknownHostException` above:
<img width="1379" height="344" alt="Image"
src="https://github.com/user-attachments/assets/bb1e1874-aff5-4c3a-940f-dcb9e029f988"
/>
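
The name-resolution failure can be checked independently of SeaTunnel. Below is a minimal sketch, assuming it is run on a machine in the Spark cluster; the class `BeHostCheck` is a hypothetical helper and not part of the connector. It only reports whether the JVM can resolve a given host name, which is the step that fails inside `TSocket.open` in the trace.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic helper, not part of SeaTunnel or StarRocks.
final class BeHostCheck {

    // Returns true if the JVM can resolve the host name (or parse the IP literal).
    static boolean resolves(String host) {
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass the BE host name reported by the FE as the first argument.
        if (args.length != 1) {
            System.err.println("usage: java BeHostCheck <host>");
            return;
        }
        System.out.println(args[0] + " resolvable: " + resolves(args[0]));
    }
}
```

Running it with the `...svc.cluster.local` name from the stack trace would be expected to report that the name is not resolvable from outside the Kubernetes cluster.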
My proposed solution: allow the BE host and port (9060) to be set through
configuration, and fall back to the addresses returned by the FE when the
user has not configured them. What do you think?
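
To make the proposal concrete, here is a minimal sketch of the fallback logic, not the connector's actual code. It assumes the user-supplied configuration takes the form of a map from the address the FE reports to an address reachable from Spark; that shape, the class name `BeAddressResolver`, and the sample addresses are all hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: prefer a user-configured BE address, else use the one from FE.
final class BeAddressResolver {

    // Keys: "host:port" as reported by FE. Values: "host:port" reachable from the client.
    private final Map<String, String> overrides;

    BeAddressResolver(Map<String, String> overrides) {
        // Defensive copy; an absent configuration is treated as an empty map.
        this.overrides = overrides == null
                ? Collections.<String, String>emptyMap()
                : new HashMap<>(overrides);
    }

    // Returns the configured address if one exists, otherwise the FE-reported address unchanged.
    String resolve(String feReportedAddress) {
        String configured = overrides.get(feReportedAddress);
        return configured != null ? configured : feReportedAddress;
    }

    public static void main(String[] args) {
        // Placeholder addresses for illustration only.
        Map<String, String> conf = new HashMap<>();
        conf.put("be-0.example.internal:9060", "192.0.2.10:9060");
        BeAddressResolver resolver = new BeAddressResolver(conf);
        System.out.println(resolver.resolve("be-0.example.internal:9060"));
        System.out.println(resolver.resolve("be-1.example.internal:9060"));
    }
}
```

A mapping is used rather than a single address because the FE may route different tablets to different BE nodes; with no mapping configured, behaviour is the same as today.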