Re: Reading Excel (.xlsm) file through PySpark 2.1.1 with external JAR is causing fatal conversion of data type

2017-08-18 Thread Jörn Franke
You have forgotten a y: it must be MM/dd/yyyy. > On 17. Aug 2017, at 21:30, Aakash Basu wrote: > Hi Palwell, > Tried doing that, but it's becoming null for all the dates after the transformation with functions. > df2 =
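
For reference, to_date() in Spark 2.1.1 takes no format argument, so the explicit pattern has to go through unix_timestamp(). A minimal sketch, reusing the dflead / Enter_Date names from the thread and assuming the column holds MM/dd/yyyy strings:

    from pyspark.sql import functions as f

    # Parse with an explicit pattern, then cast down to a date.
    # unix_timestamp() returns null for rows that do not match the pattern.
    df2 = dflead.withColumn(
        'Enter_Date_parsed',
        f.unix_timestamp(f.col('Enter_Date'), 'MM/dd/yyyy')
         .cast('timestamp')
         .cast('date'))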

Re: Reading Excel (.xlsm) file through PySpark 2.1.1 with external JAR is causing fatal conversion of data type

2017-08-17 Thread Aakash Basu
Hi Palwell, Tried doing that, but it's becoming null for all the dates after the transformation with functions. df2 = dflead.select('Enter_Date', f.to_date(df2.Enter_Date)) Any insight? Thanks, Aakash. On Fri, Aug 18, 2017 at 12:23 AM, Patrick Alwell
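
A plausible reason for the nulls: in Spark 2.1, to_date() only understands 'yyyy-MM-dd' strings and silently returns null for anything else; the snippet also references df2 on the right-hand side before it exists. A corrected sketch, assuming the source column lives in dflead and is formatted MM/dd/yyyy:

    from pyspark.sql import functions as f

    # Read from dflead (not the df2 being defined) and parse with the
    # real pattern via unix_timestamp() instead of bare to_date().
    df2 = dflead.select(
        'Enter_Date',
        f.unix_timestamp(dflead['Enter_Date'], 'MM/dd/yyyy')
         .cast('timestamp')
         .cast('date')
         .alias('Enter_Date_as_date'))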

Re: Reading Excel (.xlsm) file through PySpark 2.1.1 with external JAR is causing fatal conversion of data type

2017-08-17 Thread Aakash Basu
Hey all, Thanks! I had a discussion with the person who authored that package and informed him about this bug; in the meantime, working with the same package, I found a small tweak that gets the job done. Now I'm getting the date as a string by predefining the Schema, but I want to later
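
A sketch of what "predefining the Schema" could look like here, assuming the spark-excel build in use accepts an explicit schema (the capability discussed in the pull request linked further down the thread); the column name, sheet name, and file path are placeholders, not from the original post:

    from pyspark.sql.types import StructType, StructField, StringType

    # Force the problem column to arrive as a string instead of letting
    # inferSchema coerce it to a double.
    schema = StructType([
        StructField('Enter_Date', StringType(), True),
        # ... remaining columns ...
    ])

    dflead = spark.read.format('com.crealytics.spark.excel') \
        .option('sheetName', 'Sheet1') \
        .option('useHeader', 'true') \
        .schema(schema) \
        .load('leads.xlsm')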

Re: Reading Excel (.xlsm) file through PySpark 2.1.1 with external JAR is causing fatal conversion of data type

2017-08-16 Thread Jörn Franke
You can use Apache POI's DateUtil to convert the double to a Date (https://poi.apache.org/apidocs/org/apache/poi/ss/usermodel/DateUtil.html). Alternatively you can try HadoopOffice (https://github.com/ZuInnoTe/hadoopoffice/wiki), which supports Spark 1.x and Spark 2.0 datasources. > On 16. Aug 2017, at 20:15,
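
DateUtil is a Java API; a rough PySpark stand-in for the same conversion, assuming the workbook uses Excel's default 1900 date system (serial number = days since 1899-12-30; the sketch ignores the phantom 1900 leap day that affects serials below 61):

    import datetime
    from pyspark.sql import functions as f
    from pyspark.sql.types import DateType

    EXCEL_EPOCH = datetime.date(1899, 12, 30)

    # Shift the inferred double (an Excel serial day count) onto a real date.
    to_excel_date = f.udf(
        lambda serial: EXCEL_EPOCH + datetime.timedelta(days=int(serial))
                       if serial is not None else None,
        DateType())

    dflead = dflead.withColumn('Enter_Date',
                               to_excel_date(dflead['Enter_Date']))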

Re: Reading Excel (.xlsm) file through PySpark 2.1.1 with external JAR is causing fatal conversion of data type

2017-08-16 Thread Aakash Basu
Hey Irving, Thanks for the quick reply. In Excel that column is purely a string. I actually want to import it as a String and later play around with the DF to convert it back to a date type, but the API itself is not allowing me to dynamically assign a Schema to the DF and I'm forced to inferSchema,
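
One hedged workaround under those constraints: with the mid-2017 spark-excel releases, switching inferSchema off is supposed to make every column come back as a string, sidestepping the date-to-double coercion until a proper schema can be applied (whether it actually behaves this way depends on the package version; sheet name and path are placeholders):

    dflead = spark.read.format('com.crealytics.spark.excel') \
        .option('sheetName', 'Sheet1') \
        .option('useHeader', 'true') \
        .option('inferSchema', 'false') \
        .load('leads.xlsm')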

Re: Reading Excel (.xlsm) file through PySpark 2.1.1 with external JAR is causing fatal conversion of data type

2017-08-16 Thread Irving Duran
I think there is a difference between the actual value in the cell and how Excel formats that cell. You probably want to import that field as a string, or not format it as a date in Excel. Just a thought. Thank you, Irving Duran On Wed, Aug 16, 2017 at 12:47 PM, Aakash Basu

Re: Reading Excel (.xlsm) file through PySpark 2.1.1 with external JAR is causing fatal conversion of data type

2017-08-16 Thread Aakash Basu
Hey all, Forgot to attach the link to the discussion about overriding the Schema through the external package: https://github.com/crealytics/spark-excel/pull/13 You can see my comment there too. Thanks, Aakash. On Wed, Aug 16, 2017 at 11:11 PM, Aakash Basu wrote: > Hi all, >

Reading Excel (.xlsm) file through PySpark 2.1.1 with external JAR is causing fatal conversion of data type

2017-08-16 Thread Aakash Basu
Hi all, I am working on PySpark (*Python 3.6 and Spark 2.1.1*) and trying to fetch data from an Excel file using *spark.read.format("com.crealytics.spark.excel")*, but it is inferring a double for a date-type column. The detailed description is given here (the question I posted) -
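
A minimal reproduction of the read described above (sheet name, file path, and column name are placeholders, not from the original post):

    # With inferSchema on, the date column arrives as a double, i.e. the
    # raw Excel serial number, rather than a date or string.
    dflead = spark.read.format('com.crealytics.spark.excel') \
        .option('sheetName', 'Sheet1') \
        .option('useHeader', 'true') \
        .option('inferSchema', 'true') \
        .load('data.xlsm')

    dflead.printSchema()   # Enter_Date shows up as: double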