Hey Irving,

Thanks for the quick reply. In Excel that column is purely a string. I actually want to import it as a string and later work on the DF to convert it back to a date type, but the API does not let me dynamically assign a schema to the DF, so I'm forced to use inferSchema, which converts all numeric columns to double (though I don't know how the date column ends up as double if it is a string in the Excel source).
Thanks,
Aakash.

On 16-Aug-2017 11:39 PM, "Irving Duran" <irving.du...@gmail.com> wrote:

I think there is a difference between the actual value in the cell and what Excel formats that cell. You probably want to import that field as a string or not have it as a date format in Excel.

Just a thought....

Thank You,
Irving Duran

On Wed, Aug 16, 2017 at 12:47 PM, Aakash Basu <aakash.spark....@gmail.com> wrote:

> Hey all,
>
> Forgot to attach the link to the overriding Schema through external
> package's discussion.
>
> https://github.com/crealytics/spark-excel/pull/13
>
> You can see my comment there too.
>
> Thanks,
> Aakash.
>
> On Wed, Aug 16, 2017 at 11:11 PM, Aakash Basu <aakash.spark....@gmail.com>
> wrote:
>
>> Hi all,
>>
>> I am working on PySpark (*Python 3.6 and Spark 2.1.1*) and trying to
>> fetch data from an excel file using
>> *spark.read.format("com.crealytics.spark.excel")*, but it is inferring
>> double for a date type column.
>>
>> The detailed description is given here (the question I posted) -
>>
>> https://stackoverflow.com/questions/45713699/inferschema-using-spark-read-formatcom-crealytics-spark-excel-is-inferring-d
>>
>> Found it is a probable bug with the crealytics excel read package.
>>
>> Can somebody help me with a workaround for this?
>>
>> Thanks,
>> Aakash.
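
A possible workaround for the question above, sketched under an assumption the thread does not confirm: that the doubles produced by inferSchema are Excel serial date numbers (days counted in Excel's 1900 date system), which is one way a cell's stored value can differ from what Excel displays. The helper name `excel_serial_to_datetime` is hypothetical, and the snippet is plain Python rather than anything from the spark-excel package.

```python
from datetime import datetime, timedelta

# Assumption: the workbook uses Excel's 1900 date system. Counting from
# 1899-12-30 compensates for Excel treating 1900 as a leap year, so the
# result is only valid for serials from 1900-03-01 onward (serial >= 61).
EXCEL_EPOCH = datetime(1899, 12, 30)


def excel_serial_to_datetime(serial):
    """Convert an Excel serial number (possibly fractional) to a datetime.

    The integer part is the day count, the fractional part the time of day.
    Returns None for missing values so empty cells pass through unchanged.
    """
    if serial is None:
        return None
    return EXCEL_EPOCH + timedelta(days=float(serial))


if __name__ == "__main__":
    # 42963.0 is the serial for the date this thread was written.
    print(excel_serial_to_datetime(42963.0).date())  # 2017-08-16
```

If the assumption holds, the same function could be wrapped with `pyspark.sql.functions.udf` (with a `TimestampType` return type) and applied to the double column. If the values turn out not to be serial dates, this conversion does not apply.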