You can use Apache POI's DateUtil to convert the double to a Date (https://poi.apache.org/apidocs/org/apache/poi/ss/usermodel/DateUtil.html). Alternatively, you can try HadoopOffice (https://github.com/ZuInnoTe/hadoopoffice/wiki); it supports Spark 1.x and Spark 2.0 data sources (ds).
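Since the original question is about PySpark, here is a rough pure-Python sketch of the same kind of conversion DateUtil performs. It is not the POI API. It assumes the workbook uses Excel's default 1900 date system and that the double is a serial day count. The function name excel_serial_to_datetime is made up for illustration.

```python
from datetime import datetime, timedelta

# Assumption: the workbook uses the 1900 date system. With this epoch,
# serial numbers map correctly for dates from 1900-03-01 onwards
# (earlier serials are off by one because of Excel's 1900 leap-year quirk).
EXCEL_EPOCH = datetime(1899, 12, 30)


def excel_serial_to_datetime(serial):
    """Convert an Excel serial date (days since the epoch, as a float)
    to a Python datetime. The fractional part is the time of day."""
    return EXCEL_EPOCH + timedelta(days=float(serial))


if __name__ == "__main__":
    # 42963.0 is the serial number Excel uses for 16 Aug 2017
    print(excel_serial_to_datetime(42963.0))
```

In PySpark, a function like this could probably be wrapped in a UDF and applied to the column that inferSchema turned into double, but I have not tested that against the crealytics package.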
> On 16. Aug 2017, at 20:15, Aakash Basu <aakash.spark....@gmail.com> wrote:
>
> Hey Irving,
>
> Thanks for a quick revert. In Excel that column is purely string, I actually
> want to import that as a String and later play around the DF to convert it
> back to date type, but the API itself is not allowing me to dynamically
> assign a Schema to the DF and I'm forced to inferSchema, where itself, it is
> converting all numeric columns to double (Though, I don't know how then the
> date column is getting converted to double if it is string in the Excel
> source).
>
> Thanks,
> Aakash.
>
>
> On 16-Aug-2017 11:39 PM, "Irving Duran" <irving.du...@gmail.com> wrote:
> I think there is a difference between the actual value in the cell and what
> Excel formats that cell. You probably want to import that field as a string
> or not have it as a date format in Excel.
>
> Just a thought....
>
>
> Thank You,
>
> Irving Duran
>
>> On Wed, Aug 16, 2017 at 12:47 PM, Aakash Basu <aakash.spark....@gmail.com>
>> wrote:
>> Hey all,
>>
>> Forgot to attach the link to the overriding Schema through external
>> package's discussion.
>>
>> https://github.com/crealytics/spark-excel/pull/13
>>
>> You can see my comment there too.
>>
>> Thanks,
>> Aakash.
>>
>>> On Wed, Aug 16, 2017 at 11:11 PM, Aakash Basu <aakash.spark....@gmail.com>
>>> wrote:
>>> Hi all,
>>>
>>> I am working on PySpark (Python 3.6 and Spark 2.1.1) and trying to fetch
>>> data from an excel file using
>>> spark.read.format("com.crealytics.spark.excel"), but it is inferring double
>>> for a date type column.
>>>
>>> The detailed description is given here (the question I posted) -
>>>
>>> https://stackoverflow.com/questions/45713699/inferschema-using-spark-read-formatcom-crealytics-spark-excel-is-inferring-d
>>>
>>>
>>> Found it is a probable bug with the crealytics excel read package.
>>>
>>> Can somebody help me with a workaround for this?
>>>
>>> Thanks,
>>> Aakash.