That error sounds like it's from pandas, not Spark. Are you sure it's this
line?
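
If it does turn out to be this line, one possible cause is that `max` here is
Python's built-in rather than a Spark function. The built-in compares its
arguments with `>` and then needs a plain bool, which a Column cannot give. A
minimal sketch of a column-wise alternative, assuming `joined` has numeric
`start`, `end` and `position` columns; the names `gap` and `with_distance` are
made up for illustration:

```python
def gap(start, end, position):
    """Plain-Python reference: 0 inside [start, end], else distance to the nearer edge."""
    return max(start - position, position - end, 0)


def with_distance(joined):
    """Add a 'distance' column to a Spark DataFrame (hypothetical helper)."""
    # Imported here so the reference function above also works without Spark.
    from pyspark.sql.functions import col, greatest, lit

    # greatest() takes the row-wise maximum across several columns.
    # The constant 0 is wrapped in lit() so that it is passed as a Column.
    return joined.withColumn(
        "distance",
        greatest(
            col("start") - col("position"),
            col("position") - col("end"),
            lit(0),
        ),
    )
```

Usage would be `distances = with_distance(joined)`. The `gap` function is only
there to check the arithmetic on single values.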

On Thu, Feb 23, 2023, 12:57 PM Oliver Ruebenacker <
oliv...@broadinstitute.org> wrote:

>
>      Hello,
>
>   I'm trying to calculate the distance between a gene (with start and end)
> and a variant (with position), so I joined gene and variant data by
> chromosome and then tried to calculate the distance like this:
>
> ```
> distances = joined.withColumn("distance", max(col("start") -
> col("position"), col("position") - col("end"), 0))
> ```
>
>   Basically, the distance is the maximum of three terms.
>
>   This line causes an obscure error:
>
> ```
> ValueError: Cannot convert column into bool: please use '&' for 'and', '|'
> for 'or', '~' for 'not' when building DataFrame boolean expressions.
> ```
>
>   How can I do this? Thanks!
>
>      Best, Oliver
>
> --
> Oliver Ruebenacker, Ph.D. (he)
> Senior Software Engineer, Knowledge Portal Network <http://kp4cd.org/>,
> Flannick Lab <http://www.flannicklab.org/>,
> Broad Institute <http://www.broadinstitute.org/>
>
