Re: [PySPark] How to check if value of one column is in array of another column

2023-01-18 Thread Oliver Ruebenacker
Awesome, thanks, this was exactly what I needed!

On Tue, Jan 17, 2023 at 5:23 PM Sean Owen  wrote:

> I think you want array_contains:
>
> https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.functions.array_contains.html
>
> On Tue, Jan 17, 2023 at 4:18 PM Oliver Ruebenacker <
> oliv...@broadinstitute.org> wrote:
>
>>
>>  Hello,
>>
>>   I have data originally stored as JSON. Column gene contains a string,
>> column nearest an array of strings. How can I check whether the value of
>> gene is an element of the array of nearest?
>>
>>   I tried: genes_joined.gene.isin(genes_joined.nearest)
>>
>>   But I get an error that says:
>>
>> pyspark.sql.utils.AnalysisException: cannot resolve '(gene IN (nearest))'
>> due to data type mismatch: Arguments must be same type but were: string !=
>> array;
>>
>>   How do I do this? Thanks!
>>
>>  Best, Oliver
>>
>> --
>> Oliver Ruebenacker, Ph.D. (he)
>> Senior Software Engineer, Knowledge Portal Network, Flannick Lab,
>> Broad Institute
>> 
>>
>

-- 
Oliver Ruebenacker, Ph.D. (he)
Senior Software Engineer, Knowledge Portal Network, Flannick Lab,
Broad Institute



Re: [PySPark] How to check if value of one column is in array of another column

2023-01-17 Thread Sean Owen
I think you want array_contains:
https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.functions.array_contains.html

On Tue, Jan 17, 2023 at 4:18 PM Oliver Ruebenacker <
oliv...@broadinstitute.org> wrote:

>
>  Hello,
>
>   I have data originally stored as JSON. Column gene contains a string,
> column nearest an array of strings. How can I check whether the value of
> gene is an element of the array of nearest?
>
>   I tried: genes_joined.gene.isin(genes_joined.nearest)
>
>   But I get an error that says:
>
> pyspark.sql.utils.AnalysisException: cannot resolve '(gene IN (nearest))'
> due to data type mismatch: Arguments must be same type but were: string !=
> array;
>
>   How do I do this? Thanks!
>
>  Best, Oliver
>
> --
> Oliver Ruebenacker, Ph.D. (he)
> Senior Software Engineer, Knowledge Portal Network, Flannick Lab,
> Broad Institute
> 
>