wuchong opened a new issue, #2607:
URL: https://github.com/apache/fluss/issues/2607

### Search before asking

- [x] I searched in the [issues](https://github.com/apache/fluss/issues) and found nothing similar.

### Fluss version

0.8.0 (latest release)

### Please describe the bug 🐞

https://github.com/apache/fluss/actions/runs/21810772932/job/62922188555?pr=2545

```
SparkStreamingTest:
- write: write to log table
- write: write to primary key table
- read: log table
- read: log partition table *** FAILED ***
  == Results ==
  !== Correct Answer - 2 ==   == Spark Answer - 1 ==
  !struct<>                   struct<id:int,data:string,pt:string>
  ![4,data4,22]               [5,data5,11]
  ![5,data5,11]

  == Progress ==
     AssertOnQuery(<condition>, )
     CheckLastBatch:
     StopStream
     StartStream(ProcessingTimeTrigger(0),org.apache.spark.util.SystemClock@26aab903,Map(),null)
     AddFlussData(t,StructType(StructField(id,IntegerType,true),StructField(data,StringType,true),StructField(pt,StringType,true)),List([4,data4,22], [5,data5,11]))
     AssertOnQuery(<condition>, )
  => CheckLastBatch: [4,data4,22],[5,data5,11]
     CheckAnswer: [4,data4,22],[5,data5,11]

  == Stream ==
  Output Mode: Append
  Stream state: {org.apache.fluss.spark.read.FlussAppendMicroBatchStream@74c8c7dd: {"version":1,"table_id":3,"partition_offsets":[{"partition_id":0,"bucket_offsets":[3]},{"partition_id":1,"bucket_offsets":[2]}]}}
  Thread state: alive
  Thread stack trace:
    [email protected]/java.lang.Thread.sleep(Native Method)
    app//org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runActivatedStream$1(MicroBatchExecution.scala:308)
    app//org.apache.spark.sql.execution.streaming.MicroBatchExecution$$Lambda$5514/0x0000000801d60040.apply$mcZ$sp(Unknown Source)
    app//org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:67)
    app//org.apache.spark.sql.execution.streaming.MicroBatchExecution.runActivatedStream(MicroBatchExecution.scala:239)
    app//org.apache.spark.sql.execution.streaming.StreamExecution.$anonfun$runStream$1(StreamExecution.scala:311)
    app//org.apache.spark.sql.execution.streaming.StreamExecution$$Lambda$5506/0x0000000801d54840.apply$mcV$sp(Unknown Source)
    app//scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    app//org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900)
    app//org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:289)
    app//org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.$anonfun$run$1(StreamExecution.scala:211)
    app//org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1$$Lambda$5503/0x0000000801d53440.apply$mcV$sp(Unknown Source)
    app//scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    app//org.apache.spark.JobArtifactSet$.withActiveJobArtifactState(JobArtifactSet.scala:94)
    app//org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:211)

  == Sink ==
  0:
  1: [4,data4,22]
  2: [5,data5,11]

  == Plan ==
  == Parsed Logical Plan ==
  WriteToMicroBatchDataSource MemorySink, dbc58821-15c7-4496-87ad-75c931741811, Append, 2
  +- SubqueryAlias fluss_catalog.fluss.t
     +- StreamingDataSourceV2Relation [id#866, data#867, pt#868], fluss_catalog.fluss.t, FlussAppendScan(fluss.t,TableInfo{tablePath=fluss.t, tableId=3, schemaId=1, schema=Schema{columns=[id INT, data STRING, pt STRING], primaryKey=null, autoIncrementColumnNames=[], highestFieldId=2}, physicalPrimaryKeys=[], bucketKeys=[], partitionKeys=[pt], numBuckets=1, properties={table.replication.factor=1}, customProperties={owner=fluss}, comment='null', createdTime=1770608954778, modifiedTime=1770608954778},None,org.apache.spark.sql.util.CaseInsensitiveStringMap@4d5141e1,{bootstrap.servers=127.0.0.1:44477}), org.apache.fluss.spark.read.FlussAppendMicroBatchStream@74c8c7dd, {"version":1,"table_id":3,"partition_offsets":[{"partition_id":0,"bucket_offsets":[2]},{"partition_id":1,"bucket_offsets":[2]}]}, {"version":1,"table_id":3,"partition_offsets":[{"partition_id":0,"bucket_offsets":[3]},{"partition_id":1,"bucket_offsets":[2]}]}

  == Analyzed Logical Plan ==
  WriteToMicroBatchDataSource MemorySink, dbc58821-15c7-4496-87ad-75c931741811, Append, 2
  +- SubqueryAlias fluss_catalog.fluss.t
     +- StreamingDataSourceV2Relation [id#866, data#867, pt#868], fluss_catalog.fluss.t, FlussAppendScan(fluss.t,TableInfo{tablePath=fluss.t, tableId=3, schemaId=1, schema=Schema{columns=[id INT, data STRING, pt STRING], primaryKey=null, autoIncrementColumnNames=[], highestFieldId=2}, physicalPrimaryKeys=[], bucketKeys=[], partitionKeys=[pt], numBuckets=1, properties={table.replication.factor=1}, customProperties={owner=fluss}, comment='null', createdTime=1770608954778, modifiedTime=1770608954778},None,org.apache.spark.sql.util.CaseInsensitiveStringMap@4d5141e1,{bootstrap.servers=127.0.0.1:44477}), org.apache.fluss.spark.read.FlussAppendMicroBatchStream@74c8c7dd, {"version":1,"table_id":3,"partition_offsets":[{"partition_id":0,"bucket_offsets":[2]},{"partition_id":1,"bucket_offsets":[2]}]}, {"version":1,"table_id":3,"partition_offsets":[{"partition_id":0,"bucket_offsets":[3]},{"partition_id":1,"bucket_offsets":[2]}]}

  == Optimized Logical Plan ==
  WriteToDataSourceV2 MicroBatchWrite[epoch: 2, writer: org.apache.spark.sql.execution.streaming.sources.MemoryStreamingWrite@65e73b2e]
  +- StreamingDataSourceV2Relation [id#866, data#867, pt#868], fluss_catalog.fluss.t, FlussAppendScan(fluss.t,TableInfo{tablePath=fluss.t, tableId=3, schemaId=1, schema=Schema{columns=[id INT, data STRING, pt STRING], primaryKey=null, autoIncrementColumnNames=[], highestFieldId=2}, physicalPrimaryKeys=[], bucketKeys=[], partitionKeys=[pt], numBuckets=1, properties={table.replication.factor=1}, customProperties={owner=fluss}, comment='null', createdTime=1770608954778, modifiedTime=1770608954778},None,org.apache.spark.sql.util.CaseInsensitiveStringMap@4d5141e1,{bootstrap.servers=127.0.0.1:44477}), org.apache.fluss.spark.read.FlussAppendMicroBatchStream@74c8c7dd, {"version":1,"table_id":3,"partition_offsets":[{"partition_id":0,"bucket_offsets":[2]},{"partition_id":1,"bucket_offsets":[2]}]}, {"version":1,"table_id":3,"partition_offsets":[{"partition_id":0,"bucket_offsets":[3]},{"partition_id":1,"bucket_offsets":[2]}]}

  == Physical Plan ==
  WriteToDataSourceV2 MicroBatchWrite[epoch: 2, writer: org.apache.spark.sql.execution.streaming.sources.MemoryStreamingWrite@65e73b2e], org.apache.spark.sql.execution.datasources.v2.DataSourceV2Strategy$$Lambda$5584/0x0000000801d86c40@388d5d79
  +- *(1) Project [id#866, data#867, pt#868]
     +- MicroBatchScan[id#866, data#867, pt#868] class org.apache.fluss.spark.read.FlussAppendScan (StreamTest.scala:462)
- read: primary key table
- read: primary key partition table
```

### Solution

_No response_

### Are you willing to submit a PR?

- [ ] I'm willing to submit a PR!
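
Note on what the log shows: the sink contents indicate the two appended rows arrived in separate micro-batches (batch 1: `[4,data4,22]`, batch 2: `[5,data5,11]`), so the strict `CheckLastBatch` step fails even though all rows are eventually delivered. If it is acceptable for the Fluss source to split a single append across micro-batches, the test step could be relaxed to an eventual check. A minimal sketch, assuming the `AddFlussData` helper, the `t` table handle, and the `schema`/`stream` values seen in the log above (names not verified against the test source):

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.streaming.Trigger

// Sketch only, inside a StreamTest-based suite. AddFlussData, t, schema and
// stream are assumed from the existing SparkStreamingTest helpers.
testStream(stream)(
  StartStream(Trigger.ProcessingTime(0)),
  AddFlussData(t, schema, Seq(Row(4, "data4", "22"), Row(5, "data5", "11"))),
  // CheckAnswer waits until the sink contains all expected rows, regardless of
  // how the source spreads them over micro-batches; CheckLastBatch (the failing
  // step above) additionally requires them to land together in the final batch.
  CheckAnswer(Row(4, "data4", "22"), Row(5, "data5", "11"))
)
```

If, instead, both rows are expected to land in a single micro-batch, the relaxation above would only mask the race, and the wait condition in the `AssertOnQuery` step that precedes `CheckLastBatch` would presumably need to cover both new records before the next batch is planned.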
