Re: Spark Structured Streaming org.apache.spark.sql.functions.input_file_name Intermittently Missing FileName

2021-10-12 Thread Alchemist
This looks somehow related to the API being unable to send data from the executor to the driver. If I set the Spark master to local, I get these 6 files. When spark.master is local: InputReportAndFileName fileName file:///Users/abc/Desktop/test/Streaming/d  InputReportAndFileName fileName file:

Re: Spark Structured Streaming org.apache.spark.sql.functions.input_file_name Intermittently Missing FileName

2021-10-12 Thread Alchemist
Here is Spark's API definition; I am unable to understand what it means to have an "unknown" file. We are processing files, so we should always have a fileName. I have 7 files; it prints 3 and misses the other 4.  /** * Returns the holding file name or empty string if it is unknown. */ def getInp
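
One way to read "empty string if it is unknown" is that the current file name lives in a per-thread slot which is only filled while a file is being read on that thread. The sketch below is a simplified plain-Java model of that idea, not Spark's implementation: the class FileNameHolderSketch, its methods, and the path file:///tmp/example/a.json are all hypothetical, and the assumption that Spark tracks the file name per task thread should be checked against the Spark source.

```java
public class FileNameHolderSketch {
    // Hypothetical per-thread slot; defaults to "" when nothing has been set.
    private static final ThreadLocal<String> CURRENT_FILE =
            ThreadLocal.withInitial(() -> "");

    // Called by the (imaginary) reader when it starts on a file.
    static void set(String path) {
        CURRENT_FILE.set(path);
    }

    // Called when the reader is done; the slot falls back to "".
    static void unset() {
        CURRENT_FILE.remove();
    }

    // Returns the holding file name, or an empty string if it is unknown.
    static String getInputFileName() {
        return CURRENT_FILE.get();
    }

    public static void main(String[] args) throws InterruptedException {
        set("file:///tmp/example/a.json"); // hypothetical path
        System.out.println("reader thread: '" + getInputFileName() + "'");

        // A different thread never had the slot set, so it sees "".
        Thread other = new Thread(() ->
                System.out.println("other thread: '" + getInputFileName() + "'"));
        other.start();
        other.join();

        unset();
    }
}
```

In this model the name is "unknown" whenever the code asking for it runs somewhere the reader did not set it. If Spark behaves similarly, an empty result would point at where the expression is evaluated rather than at the files themselves.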

Spark Structured Streaming org.apache.spark.sql.functions.input_file_name Intermittently Missing FileName

2021-10-11 Thread Alchemist
Hello all, I am trying to extract the file name as follows, but intermittently we are getting an empty file name. Step 1: Get Schema: StructType jsonSchema = sparkSession.read() .option("multiLine", true) .json("src/main/resources/sample.json") .schema(); Step 2: Get Input DataSet: Dataset inputDS = sparkS