Wow, you guys, Anastasios, Bjørn and Mich, are stars!
Thank you very much for your suggestions. I’m going to print them and study 
them closely.


> On 2 Apr 2023, at 20:05, Anastasios Zouzias <zouz...@gmail.com> wrote:
> 
> Hi Philippe,
> 
> I would like to draw your attention to this great library that saved my day 
> in the past when parsing phone numbers in Spark: 
> 
> https://github.com/google/libphonenumber
> 
> If you combine it with Bjørn's suggestions you will have a good start on your 
> linkage task.
> 
> Best regards,
> Anastasios Zouzias
> 
> 
> On Sat, Apr 1, 2023 at 8:31 PM Philippe de Rochambeau <phi...@free.fr> wrote:
>> Hello,
>> I’m looking for an efficient way in Spark to search for a series of 
>> telephone numbers, contained in a CSV file, in a data set column.
>> 
>> In pseudo code,
>> 
>> for tel in [tel1, tel2, …, tel40,000]
>>         search for tel in dataset using .like("%tel%")
>> end for
>> 
>> I’m using the like function because the telephone numbers in the data set 
>> may contain prefixes, such as "+"; e.g., "+3312224444".
>> 
>> Any suggestions would be welcome.
>> 
>> Many thanks.
>> 
>> Philippe
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org 
>> <mailto:user-unsubscr...@spark.apache.org>
>> 
> 
> 
> -- 
> -- Anastasios Zouzias
>  <mailto:a...@zurich.ibm.com>
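
For anyone finding this thread later, here is a minimal sketch of one way to avoid the 40,000 .like("%tel%") scans from the original question: normalize both sides to the same digits-only form, then do a single broadcast join. This is not necessarily what Bjørn or Mich proposed. The helper names (normalize_tel, match_numbers), the column names, and the French default country code "33" are assumptions made for illustration. The hand-written normalization rule is a deliberately crude stand-in for a real parser such as libphonenumber.

```python
import re


def normalize_tel(raw, default_cc="33"):
    """Reduce a phone number to digits in international form.

    Assumed rule (illustrative only): drop every non-digit, turn a
    leading "00" into nothing, and turn a single leading "0" into the
    default country code.
    """
    digits = re.sub(r"\D", "", raw or "")
    if digits.startswith("00"):        # 0033... -> 33...
        digits = digits[2:]
    elif digits.startswith("0"):       # national 01... -> 331...
        digits = default_cc + digits[1:]
    return digits


def match_numbers(dataset_df, tel_col, wanted_df, wanted_col="tel"):
    """Return the rows of dataset_df whose normalized number is in wanted_df.

    dataset_df and wanted_df are Spark DataFrames; tel_col and wanted_col
    name the (hypothetical) string columns holding the phone numbers.
    """
    # Imported lazily so normalize_tel can be used without Spark installed.
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    norm = F.udf(normalize_tel, StringType())

    left = dataset_df.withColumn("tel_norm", norm(F.col(tel_col)))
    right = (
        wanted_df
        .select(norm(F.col(wanted_col)).alias("tel_norm"))
        .dropDuplicates()
    )

    # The list of wanted numbers is small, so broadcast it and run one
    # equi-join instead of one LIKE scan per number.
    return left.join(F.broadcast(right), on="tel_norm", how="inner")
```

Usage would look something like match_numbers(dataset, "phone", spark.read.csv("tels.csv", header=True)), where dataset, "phone" and tels.csv are placeholders. An exact-match join only works if both sides normalize to the same string, so it is worth checking the normalization rule on a sample of the real data first.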
