Have you seen the thread 'Filter on a column having multiple values' where
Michael gave this example?

https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/1023043053387187/1075277772969592/2840265927289860/2388bac36e.html

FYI
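In short: the loop below builds 1500 filter + unionAll steps into one enormous query plan. A single isin filter over the whole address list does the same work in one pass. A minimal sketch, assuming `lines` is the local Seq[String] of addresses from the question below:

```scala
// Sketch only -- assumes a running SparkContext/SQLContext and that
// dfCustomers1 and lines are defined as in the question below.
import org.apache.spark.sql.functions.col

// Take the same 1500 addresses the loop iterates over.
val addresses: Seq[String] = lines.take(1500)

// One filter over the table instead of 1500 filter + unionAll calls;
// Column.isin (Spark 1.5+) compiles to a single In predicate.
val k = dfCustomers1.filter(col("Address").isin(addresses: _*))
k.show()
```

If the address list is too large to inline as a predicate, put the addresses in their own DataFrame and do an inner join on the Address column instead.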

On Wed, Mar 2, 2016 at 9:33 PM, Angel Angel <areyouange...@gmail.com> wrote:

> Hello Sir/Madam,
>
> I am writing an application using Spark SQL.
>
> I built a very big table using the following command:
>
> val dfCustomers1 = sc.textFile("/root/Desktop/database.txt")
>   .map(_.split(","))
>   .map(p => Customer1(p(0), p(1).trim.toInt, p(2).trim.toInt, p(3)))
>   .toDF()
>
> Now I want to search the Address field for many addresses and build a
> new table from the matching rows:
>
> var k = dfCustomers1.filter(dfCustomers1("Address").equalTo(lines(0)))
>
>
>
> for (a <- 1 until 1500) {
>   var temp = dfCustomers1.filter(dfCustomers1("Address").equalTo(lines(a)))
>   k = temp.unionAll(k)
> }
>
> k.show
>
>
>
>
> But this is taking a very long time. Can you suggest an optimized
> approach so I can reduce the execution time?
>
>
> My cluster has 3 slaves and 1 master.
>
>
> Thanks.
>
