Many thanks, Mich. Is "foreach" the best construct to look up items in a dataset such as the "telephonedirectory" data set below?
// the telephone sequence was read from a CSV file
val telrdd = spark.sparkContext.parallelize(Seq("tel1", "tel2", "tel3" …))
val ds = spark.read.parquet("/path/to/telephonedirectory")
telrdd.foreach(tel => {
  longAcc.select("*").rlike("+" + tel)
})

> On 1 Apr 2023 at 22:36, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
> This may help
>
> Spark rlike() Working with Regex Matching Example
> <https://sparkbyexamples.com/spark/spark-rlike-regex-matching-examples/>
>
> Mich Talebzadeh,
> Lead Solutions Architect/Engineering Lead
> Palantir Technologies Limited
>
> view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
> https://en.everybodywiki.com/Mich_Talebzadeh
>
> Disclaimer: Use it at your own risk. Any and all responsibility for any loss,
> damage or destruction of data or any other property which may arise from
> relying on this email's technical content is explicitly disclaimed. The
> author will in no case be liable for any monetary damages arising from such
> loss, damage or destruction.
>
> On Sat, 1 Apr 2023 at 19:32, Philippe de Rochambeau <phi...@free.fr> wrote:
>> Hello,
>> I'm looking for an efficient way in Spark to search for a series of
>> telephone numbers, contained in a CSV file, in a data set column.
>>
>> In pseudo code,
>>
>> for tel in [tel1, tel2, …, tel40,000]
>>     search for tel in dataset using .like("%tel%")
>> end for
>>
>> I'm using the like function because the telephone numbers in the data set
>> may contain prefixes, such as "+"; e.g., "+3312224444".
>>
>> Any suggestions would be welcome.
>>
>> Many thanks.
>>
>> Philippe
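
For comparison with the per-number loop in the pseudo code above, here is a minimal sketch of one possible alternative: normalise the directory column once, then do a single broadcast join against the list of numbers instead of 40,000 separate like/rlike scans. It is only a sketch under stated assumptions, not a tested answer for the real data. The column names "phone", "name" and "tel", the object name TelLookup and the sample rows are all hypothetical, and it assumes the only difference between the two sides is one leading "+". If numbers can be embedded in longer strings, an equality join like this will not find them.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{broadcast, col, regexp_replace}

object TelLookup {
  // Strip a single leading "+" from the (hypothetical) "phone" column,
  // then equi-join against the small list of numbers, which is broadcast
  // to every executor so the large directory is scanned only once.
  def findMatches(directory: DataFrame, tels: DataFrame): DataFrame =
    directory
      .withColumn("tel", regexp_replace(col("phone"), "^\\+", ""))
      .join(broadcast(tels), Seq("tel"), "inner")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("tel-lookup-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-ins for the CSV list and the parquet directory.
    val tels = Seq("3312224444", "3315556666").toDF("tel")
    val directory = Seq(
      ("Alice", "+3312224444"),
      ("Bob", "+3319990000"),
      ("Carol", "3315556666")
    ).toDF("name", "phone")

    findMatches(directory, tels).show(truncate = false)
    spark.stop()
  }
}
```

With the real data, `tels` would come from `spark.read.csv(...)` and `directory` from `spark.read.parquet("/path/to/telephonedirectory")`. A list of 40,000 short strings is normally small enough to broadcast, but that depends on the cluster configuration.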