Re: Filtering based on a float value with more than one decimal place not working correctly in Pyspark dataframe
Is this not just a case of floating-point literals not being exact? The comparison is expressed in Python, not SQL, so 1.236 is a 64-bit double literal being compared against a column stored as a 32-bit FloatType.

On Wed, Sep 26, 2018 at 12:46 AM Meethu Mathew wrote:
> Hi all,
> I tried the following code and the output was not as expected. [...]
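To make the precision point concrete, here is a small sketch (plain Python, no Spark needed) that emulates what storing 1.236 in a 32-bit FloatType column does to the value. The names here are illustrative, not from the thread:

```python
import struct

# Round-trip 1.236 through a 32-bit float, which is effectively what
# storing it in a FloatType column does.
stored = struct.unpack('f', struct.pack('f', 1.236))[0]

print(stored)            # approximately 1.236, but not exactly
print(stored == 1.236)   # False: the 32-bit value != the 64-bit literal
print(abs(stored - 1.236) < 1e-6)  # True: equal within a small tolerance
```

This also suggests why `>` and `<` still appear to work: the stored value is within roughly 1e-7 of 1.236, so order comparisons against nearby thresholds are unaffected, while exact equality is not.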
Re: Filtering based on a float value with more than one decimal place not working correctly in Pyspark dataframe
I think it is similar to SPARK-25452.

Regards,
Sandeep Katta

On Wed, 26 Sep 2018 at 11:16 AM, Meethu Mathew wrote:
> Hi all,
> I tried the following code and the output was not as expected. [...]
Filtering based on a float value with more than one decimal place not working correctly in Pyspark dataframe
Hi all,

I tried the following code and the output was not as expected.

    schema = StructType([StructField('Id', StringType(), False),
                         StructField('Value', FloatType(), False)])
    df_test = spark.createDataFrame([('a', 5.0), ('b', 1.236), ('c', -0.31)], schema)
    df_test

Output: DataFrame[Id: string, Value: float]

Filtering on the float value with more than one decimal place did not return the expected row [screenshot omitted]. But when the value is given as a string, it worked [screenshot omitted]. I again tried with a floating-point number with one decimal place and it worked [screenshot omitted]. And when the equals operation is changed to greater than or less than, it works with numbers that have more than one decimal place [screenshot omitted].

Is this a bug?

Regards,
Meethu Mathew