Hello everyone, I'm using Scala and Spark 3.4.1 on Windows 10. While streaming with Spark, I set the `cleanSource` option to "archive" and the `sourceArchiveDir` option to "archived", as in the code below.
```
spark.readStream
  .option("cleanSource", "archive")
  .option("sourceArchiveDir", "archived")
  .option("enforceSchema", false)
  .option("header", includeHeader)
  .option("inferSchema", inferSchema)
  .options(otherOptions)
  .schema(csvSchema.orNull)
  .csv(FileUtils.getPath(sourceSettings.dataFolderPath, mappingSource.path).toString)
```

The call `FileUtils.getPath(sourceSettings.dataFolderPath, mappingSource.path)` returns a relative path like: `test-data\streaming-folder\patients`

When I start the stream, Spark does not move the source CSV files to the archive folder. After working on it a bit, I started debugging the Spark source code and found the `override protected def cleanTask(entry: FileEntry): Unit` method in `FileStreamSource.scala`, in the `org.apache.spark.sql.execution.streaming` package. On line 569, `!fileSystem.rename(curPath, newPath)` is supposed to move the source file to the archive folder. However, while debugging I noticed that `curPath` and `newPath` had the following values:

**curPath**: `file:/C:/dev/be/data-integration-suite/test-data/streaming-folder/patients/patients-success.csv`

**newPath**: `file:/C:/dev/be/data-integration-suite/archived/C:/dev/be/data-integration-suite/test-data/streaming-folder/patients/patients-success.csv`

It seems the absolute path of the CSV file was appended when building `newPath`, because `C:/dev/be/data-integration-suite` appears twice in `newPath`. This is why Spark's archiving does not work. Instead, `newPath` should be: `file:/C:/dev/be/data-integration-suite/archived/test-data/streaming-folder/patients/patients-success.csv`.

I guess this is more of an issue in the Spark library itself, and maybe a Spark bug? Is there any workaround or Spark config to overcome this problem?

Thanks

Best regards,
Yunus Emre
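**Update:** To see where the duplication comes from, I tried to reproduce the path construction outside of Spark. This is only my reading of `cleanTask` (the exact expression in the Spark source may differ), using the values I observed in the debugger:

```
import org.apache.hadoop.fs.Path

object ArchivePathRepro extends App {
  // The relative "archived" option resolves against the working directory:
  val baseArchivePath = new Path("file:/C:/dev/be/data-integration-suite/archived")
  val curPath = new Path(
    "file:/C:/dev/be/data-integration-suite/test-data/streaming-folder/patients/patients-success.csv")

  // cleanTask appears to append the URI *path* of the source file to the
  // archive dir. On Windows, curPath.toUri.getPath starts with "/C:/...",
  // so the drive-qualified path is appended wholesale and the
  // "C:/dev/be/data-integration-suite" prefix ends up duplicated.
  val newPath = new Path(baseArchivePath.toString.stripSuffix("/") + curPath.toUri.getPath)

  println(newPath)
  // file:/C:/dev/be/data-integration-suite/archived/C:/dev/be/data-integration-suite/
  //   test-data/streaming-folder/patients/patients-success.csv
}
```

On Linux the same concatenation would be harmless, because `toUri.getPath` returns something like `/home/user/...`, which nests cleanly under the archive directory; it is the Windows drive letter that breaks it.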
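The only stopgap I can think of is to avoid the rename entirely: `cleanSource` also accepts "delete" (and "off"), and "delete" removes completed source files instead of moving them, so the faulty `newPath` construction is never exercised. That loses the originals, of course, so it's only acceptable for me because the raw CSVs are kept elsewhere:

```
spark.readStream
  // "delete" removes completed files instead of renaming them into an
  // archive dir, sidestepping the broken path construction on Windows
  .option("cleanSource", "delete")
  .option("enforceSchema", false)
  .option("header", includeHeader)
  .option("inferSchema", inferSchema)
  .options(otherOptions)
  .schema(csvSchema.orNull)
  .csv(FileUtils.getPath(sourceSettings.dataFolderPath, mappingSource.path).toString)
```

Still, I'd prefer real archiving, so any pointer to a config-level fix or an existing Spark ticket would be appreciated.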