Hello all,
I am trying to extract the file name as shown in the following steps, but
intermittently we are getting an empty file name.

Step 1: Get Schema

    StructType jsonSchema = sparkSession.read()
        .option("multiLine", true)
        .json("src/main/resources/sample.json")
        .schema();

Step 2: Get Input DataSet

    Dataset<Row> inputDS = sparkSession
        .readStream()
        .format("text")
        .option("multiLine", true)
        .schema(jsonSchema)
        .json(inputPath + "/*");

Step 3: Add fileName column

    Dataset<Row> inputDf = inputDS.select(functions.col("Report")).toJSON()
        .withColumn("FileName", org.apache.spark.sql.functions.input_file_name());

Step 4: Print fileName

    Dataset<InputReportAndFileName> inputDF = inputDf
        .as(ExpressionEncoder.javaBean(InputReportAndFileName.class))
        .map((MapFunction<InputReportAndFileName, InputReportAndFileName>) inputReportAndFileName -> {
            System.out.println("&&&&&&&&&&&&&  InputReportAndFileName fileName "
                + inputReportAndFileName.getFileName());
            return inputReportAndFileName;
        }, ExpressionEncoder.javaBean(InputReportAndFileName.class));

Output: Here we see the missing fileName

    &&&&&&&&&&&&&  InputReportAndFileName fileName
    &&&&&&&&&&&&&  InputReportAndFileName fileName file:///Users/abc/Desktop/test/Streaming/2021-Aug-14-042000_001E46_1420254%202
    &&&&&&&&&&&&&  InputReportAndFileName fileName file:///Users/abc/Desktop/test/Streaming/2021-Aug-14-042000_001E46_14202040
    &&&&&&&&&&&&&  InputReportAndFileName fileName file:///Users/abc/Desktop/test/Streaming/2021-Aug-14-042000_001E46_142720%202
    &&&&&&&&&&&&&  InputReportAndFileName fileName
    &&&&&&&&&&&&&  InputReportAndFileName fileName
    &&&&&&&&&&&&&  InputReportAndFileName fileName
