Hi Imran,

It seems that you are not caching your underlying DataFrame. I would suggest
forcing a cache with tweets.cache() followed by tweets.count() — cache() alone
is lazy, so nothing is materialized until an action runs. Let us know if the
problem persists.
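A minimal sketch of what I mean, assuming the spark-solr datasource from the
blog you linked; the zkhost value and the collection name "tweets" are
placeholders for your own setup:

    // assumes the spark-solr connector is on the classpath;
    // "localhost:9983" and "tweets" are placeholders
    val tweets = spark.read.format("solr")
      .option("zkhost", "localhost:9983")      // your ZooKeeper connect string
      .option("collection", "tweets")          // your Solr collection
      .load()

    tweets.cache()                             // mark the DataFrame for caching
    tweets.count()                             // action: materializes the cache now
    tweets.createOrReplaceTempView("tweets")   // later SQL reads the cached data

Without the count(), the first query against the registered view still pays
the full Solr scan, and depending on how the view was created each query may
go back to Solr every time.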

Best,
Anastasios

On Wed, Jul 19, 2017 at 2:49 PM, Imran Rajjad <raj...@gmail.com> wrote:

> Greetings,
>
> We are trying out Spark 2 + ThriftServer to join multiple
> collections from a Solr Cloud (6.4.x). I have followed this blog
> https://lucidworks.com/2015/08/20/solr-spark-sql-datasource/
>
> I understand that Spark initially populates the temporary table with 18633014
> records and takes its due time; however, any subsequent SQL queries on the
> temporary table take the same amount of time. It seems the temporary
> table is not being re-used or cached. The fields in the Solr collection do
> not have docValues enabled; could that be the reason? Apparently I have
> missed a trick.
>
> regards,
> Imran
>
> --
> I.R
>



-- 
Anastasios Zouzias
<a...@zurich.ibm.com>