Re: Reading Excel (.xlsm) file through PySpark 2.1.1 with external JAR is causing fatal conversion of data type

Aakash Basu Wed, 16 Aug 2017 11:15:48 -0700

Hey Irving,

Thanks for the quick reply. In Excel that column is purely a string. I
actually want to import it as a String and later work on the DF to
convert it to a date type, but the API does not let me assign a schema
to the DF dynamically, so I'm forced to use inferSchema, which converts
all numeric columns to double (though I don't know how the date column
ends up as a double if it is a string in the Excel source).


Thanks,
Aakash.


On 16-Aug-2017 11:39 PM, "Irving Duran" <irving.du...@gmail.com> wrote:

I think there is a difference between the actual value in the cell and how
Excel formats that cell. You probably want to import that field as a
string, or not have it in a date format in Excel.
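
To illustrate the distinction: if the cell is formatted as a date, the
stored value is typically a serial day count, which would explain the
doubles that inferSchema reports. A minimal sketch of converting such a
value back, assuming the workbook uses Excel's default 1900 date system and
the serials are 61 or greater (the helper names below are hypothetical and
not part of spark-excel):

```python
from datetime import date, datetime, timedelta

# Assumption: Excel's 1900 date system. Excel wrongly treats 1900 as a
# leap year, so for serials >= 61 the effective day zero is 1899-12-30.
EXCEL_EPOCH = datetime(1899, 12, 30)


def excel_serial_to_datetime(serial):
    """Convert an Excel serial number (days, possibly fractional) to a datetime."""
    if serial is None:
        return None
    # The integer part is whole days; the fractional part is the time of day.
    return EXCEL_EPOCH + timedelta(days=float(serial))


def excel_serial_to_date(serial):
    """Convert an Excel serial number to a date, dropping any time of day."""
    converted = excel_serial_to_datetime(serial)
    return None if converted is None else converted.date()


if __name__ == "__main__":
    print(excel_serial_to_date(42963.0))  # 2017-08-16
```

The same arithmetic could be wrapped in a UDF and applied to the double
column after loading; that step is not shown here.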

Just a thought....


Thank You,

Irving Duran

On Wed, Aug 16, 2017 at 12:47 PM, Aakash Basu <aakash.spark....@gmail.com>
wrote:

> Hey all,
>
> I forgot to attach the link to the discussion about overriding the
> schema in the external package.
>
> https://github.com/crealytics/spark-excel/pull/13
>
> You can see my comment there too.
>
> Thanks,
> Aakash.
>
> On Wed, Aug 16, 2017 at 11:11 PM, Aakash Basu <aakash.spark....@gmail.com>
> wrote:
>
>> Hi all,
>>
>> I am working with PySpark (*Python 3.6 and Spark 2.1.1*) and trying to
>> fetch data from an Excel file using
>> *spark.read.format("com.crealytics.spark.excel")*, but it is inferring
>> double for a date-type column.
>>
>> The detailed description is in the question I posted here:
>>
>> https://stackoverflow.com/questions/45713699/inferschema-using-spark-read-formatcom-crealytics-spark-excel-is-inferring-d
>>
>>
>> It appears to be a probable bug in the crealytics Excel reader package.
>>
>> Can somebody help me with a workaround for this?
>>
>> Thanks,
>> Aakash.
>>
>
>
