akashorabek commented on PR #30800:
URL: https://github.com/apache/beam/pull/30800#issuecomment-2031360719

   > Looks like it has only 100-50 MB/s even for 20 workers,
   > 
   > and there are errors in log:
   > 
   > ```
   > DEADLINE_EXCEEDED writing batch of 25 mutations to Cloud Spanner, retrying after backoff of 9137ms
   > (DEADLINE_EXCEEDED: com.google.api.gax.rpc.DeadlineExceededException: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: Deadline Exceeded)
   > ```
   > 
   > Looks like the write is throttled on the Spanner side. If you double the number of units of the Spanner instance, does the throughput change?
   > 
   > The stress test is testing the capability of Beam IO, so we want the Spanner side to have sufficient capacity.
   > 
   > Also need to run spotlessApply to clear the PreCommit failure
   
   Yeah, you were right about throttling on the Spanner side. I tested different numbers of Spanner instance nodes, and here are the results:
   
   - [5 nodes.](https://console.cloud.google.com/dataflow/jobs/us-central1/2024-04-01_01_31_36-2551455892432405596;step=Write%20to%20Spanner;mainTab=JOB_GRAPH;bottomTab=WORKER_LOGS;bottomStepTab=DATA_SAMPLING;logsSeverity=INFO;graphView=0?project=apache-beam-testing&pageState=(%22dfTime%22:(%22l%22:%22dfJobMaxTime%22))) Average throughput is 250-400 MB/s
   - [20 nodes.](https://console.cloud.google.com/dataflow/jobs/us-central1/2024-04-01_23_29_28-17873229330511088299;step=Write%20to%20Spanner;mainTab=JOB_GRAPH;graphView=0?project=apache-beam-testing&pageState=(%22dfTime%22:(%22l%22:%22dfJobMaxTime%22))) Throughput is 500-750 MB/s
   - [30 nodes.](https://console.cloud.google.com/dataflow/jobs/us-central1/2024-04-01_12_21_23-12244802819720839592;step=Write%20to%20Spanner;mainTab=JOB_GRAPH;graphView=0?project=apache-beam-testing&pageState=(%22dfTime%22:(%22l%22:%22dfJobMaxTime%22))) Throughput is 600-900 MB/s
   - [40 nodes.](https://console.cloud.google.com/dataflow/jobs/us-central1/2024-04-02_00_15_40-17710127828952119517;step=Write%20to%20Spanner;mainTab=JOB_GRAPH;graphView=0?project=apache-beam-testing&pageState=(%22dfTime%22:(%22l%22:%22dfJobMaxTime%22))) Basically the same as with 30 nodes.
   
   Since there is not much difference in performance beyond 30 nodes, I decided to use this number.
   Also fixed the DEADLINE_EXCEEDED warnings. 
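
The reasoning behind settling on 30 nodes can be sketched as a simple diminishing-returns check. This is only an illustration, not code from the PR: the function name `pick_node_count`, the 10% gain threshold, and the use of range midpoints are all assumptions, and the 40-node figure is assumed equal to the 30-node one because the comment only reports it as "basically the same".

```python
# Midpoints of the throughput ranges reported above, in MB/s.
# The 40-node value is an assumption: it was only described as
# "basically the same" as 30 nodes, so the 30-node midpoint is reused.
THROUGHPUT_MBPS = {
    5: (250 + 400) / 2,
    20: (500 + 750) / 2,
    30: (600 + 900) / 2,
    40: (600 + 900) / 2,  # assumed, not a reported number
}


def pick_node_count(throughput_by_nodes, min_gain=0.10):
    """Return the smallest node count whose next step up gains less than min_gain."""
    counts = sorted(throughput_by_nodes)
    for current, larger in zip(counts, counts[1:]):
        # Relative throughput improvement when moving to the next larger size.
        gain = throughput_by_nodes[larger] / throughput_by_nodes[current] - 1
        if gain < min_gain:
            return current
    # No plateau found: fall back to the largest configuration tested.
    return counts[-1]


print(pick_node_count(THROUGHPUT_MBPS))
```

With these inputs the step from 30 to 40 nodes is the first one that falls under the threshold, so the sketch selects 30, matching the choice described above.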


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

