I found the reason it did not work: when returning the Spark data type, I was calling new StringType(). After changing it to DataTypes.StringType, it worked.
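
For readers who hit the same error, here is a minimal plain-Java sketch of one plausible explanation, which is not confirmed anywhere in this thread: a cast check that recognizes only a shared singleton instance will reject a freshly constructed object of the same class. The names TextType, INSTANCE, and canCastTo are hypothetical stand-ins and are not Spark APIs. In Spark's Java API the shared instance is the one exposed as DataTypes.StringType.

```java
// Illustrative model only: TextType and canCastTo are hypothetical and do not
// reproduce Spark's implementation.
class SingletonTypeSketch {

    // Stand-in for a type class that has one shared instance but whose
    // constructor is still reachable by callers.
    static class TextType {
        static final TextType INSTANCE = new TextType();

        TextType() {}
    }

    // Hypothetical cast check that accepts only the shared instance,
    // comparing by reference rather than by class.
    static boolean canCastTo(Object targetType) {
        return targetType == TextType.INSTANCE;
    }

    public static void main(String[] args) {
        // Using the shared instance passes the check.
        System.out.println(canCastTo(TextType.INSTANCE));

        // A newly constructed object has the same class but is a different
        // reference, so the check rejects it.
        System.out.println(canCastTo(new TextType()));
    }
}
```

Under this assumption, the practical rule is to always take type objects from DataTypes (for example DataTypes.StringType) rather than constructing them with new.
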
Regards,
Rico.

> On 17 Feb 2022 at 14:13, Gourav Sengupta <gourav.sengu...@gmail.com> wrote:
>
> Hi,
>
> Can you please post a screenshot of the exact CAST statement that you are
> using? Did you use the SQL method I mentioned earlier?
>
> Regards,
> Gourav Sengupta
>
>> On Thu, Feb 17, 2022 at 12:17 PM Rico Bergmann <i...@ricobergmann.de> wrote:
>> Hi!
>>
>> Casting another int column that is not a partition column fails with the
>> same error.
>>
>> The schema before the cast (column names are anonymized):
>>
>> root
>>  |-- valueObject: struct (nullable = true)
>>  |    |-- value1: string (nullable = true)
>>  |    |-- value2: string (nullable = true)
>>  |    |-- value3: timestamp (nullable = true)
>>  |    |-- value4: string (nullable = true)
>>  |-- partitionColumn2: string (nullable = true)
>>  |-- partitionColumn3: timestamp (nullable = true)
>>  |-- partitionColumn1: integer (nullable = true)
>>
>> I wanted to cast partitionColumn1 to String, which gives me the described
>> error.
>>
>> Best,
>> Rico
>>
>>
>>>> On 17 Feb 2022 at 09:56, ayan guha <guha.a...@gmail.com> wrote:
>>>>
>>>
>>> Can you try to cast any other Int field which is NOT a partition column?
>>>
>>>> On Thu, 17 Feb 2022 at 7:34 pm, Gourav Sengupta
>>>> <gourav.sengu...@gmail.com> wrote:
>>>> Hi,
>>>>
>>>> This is interesting; casting INT to STRING has never been an issue
>>>> for me.
>>>>
>>>> Can you share the output of df.printSchema()?
>>>>
>>>> I prefer to use SQL, and the method I use for casting is:
>>>> CAST(<<column name>> AS STRING) <<alias>>.
>>>>
>>>> Regards,
>>>> Gourav
>>>>
>>>>> On Thu, Feb 17, 2022 at 6:02 AM Rico Bergmann <i...@ricobergmann.de>
>>>>> wrote:
>>>>> Here is the code snippet:
>>>>>
>>>>> var df = session.read().parquet(basepath);
>>>>> for (Column partition : partitionColumnsList) {
>>>>>     df = df.withColumn(partition.getName(),
>>>>>         df.col(partition.getName()).cast(partition.getType()));
>>>>> }
>>>>>
>>>>> Column is a class containing schema information, for example the
>>>>> name of the column and the data type of the column.
>>>>>
>>>>> Best, Rico.
>>>>>
>>>>> > On 17 Feb 2022 at 03:17, Morven Huang <morven.hu...@gmail.com> wrote:
>>>>> >
>>>>> > Hi Rico, do you have a code snippet? I have no problem casting int to
>>>>> > string.
>>>>> >
>>>>> >> On 17 Feb 2022 at 12:26 AM, Rico Bergmann <i...@ricobergmann.de> wrote:
>>>>> >>
>>>>> >> Hi!
>>>>> >>
>>>>> >> I am reading a partitioned DataFrame into Spark using automatic type
>>>>> >> inference for the partition columns. For one partition column the data
>>>>> >> contains an integer, therefore Spark uses IntegerType for this column.
>>>>> >> In general this is supposed to be a StringType column. So I tried to
>>>>> >> cast this column to StringType. But this fails with AnalysisException
>>>>> >> “cannot cast int to string”.
>>>>> >>
>>>>> >> Is this a bug? Or is it really not allowed to cast an int to a string?
>>>>> >>
>>>>> >> I’m using Spark 3.1.1
>>>>> >>
>>>>> >> Best regards
>>>>> >>
>>>>> >> Rico.
>>>>> >>
>>>>> >> ---------------------------------------------------------------------
>>>>> >> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>>
>>> --
>>> Best Regards,
>>> Ayan Guha