Re: Spark reading from cassandra

2020-11-04 Thread Russell Spitzer
A where clause with a PK restriction should be identified by the Connector
and transformed into a single request. This will still be much slower than
issuing the request directly, but much, much faster than a full scan.
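
To make the cost difference concrete, here is a toy sketch in plain Python. It is not the Connector and does not talk to Cassandra; ToyTable, sensor_id and seq are hypothetical names, and the only assumption is that rows are grouped by partition key. It counts how many rows each approach has to touch to return the same result.

```python
from collections import defaultdict


# Toy in-memory stand-in for a table whose rows are grouped by partition key.
# All names here are hypothetical and exist only for illustration.
class ToyTable:
    def __init__(self):
        self.partitions = defaultdict(list)
        self.rows_read = 0  # rows touched by the most recent query

    def insert(self, partition_key, row):
        self.partitions[partition_key].append(row)

    def by_partition_key(self, partition_key):
        """Key restriction: go straight to the one partition that matters."""
        rows = self.partitions.get(partition_key, [])
        self.rows_read = len(rows)
        return list(rows)

    def full_scan(self, predicate):
        """No usable key restriction: read every row, then filter."""
        self.rows_read = 0
        matches = []
        for rows in self.partitions.values():
            for row in rows:
                self.rows_read += 1
                if predicate(row):
                    matches.append(row)
        return matches


table = ToyTable()
for sensor_id in range(100):
    for seq in range(10):
        table.insert(sensor_id, {"sensor_id": sensor_id, "seq": seq})

keyed = table.by_partition_key(42)
keyed_cost = table.rows_read

scanned = table.full_scan(lambda row: row["sensor_id"] == 42)
scan_cost = table.rows_read

print(keyed_cost, scan_cost)
```

Both calls return the same rows, but the keyed lookup touches one partition while the scan touches all of them.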

On Wed, Nov 4, 2020 at 12:51 PM Russell Spitzer wrote:

> Yes, the "Allow filtering" part isn't actually important other than for
> letting the query run in the first place. A where clause that uses a
> clustering column restriction will perform much better than a full scan.
> Column pruning can also be extremely beneficial.
>
> On Wed, Nov 4, 2020 at 11:12 AM Amit Sharma wrote:
>
>> Hi, I have a question. When reading from Cassandra, should we use only the
>> partition key in the where clause for performance, or does it not matter
>> from Spark's perspective because it always allows filtering?
>>
>>
>> Thanks
>> Amit
>>
>


Re: Spark reading from cassandra

2020-11-04 Thread Russell Spitzer
Yes, the "Allow filtering" part isn't actually important other than for
letting the query run in the first place. A where clause that uses a
clustering column restriction will perform much better than a full scan.
Column pruning can also be extremely beneficial.
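
As a rough illustration of why both help, the plain-Python sketch below models a single partition whose rows are sorted by a clustering column. It is a toy, not the Connector's behaviour; the column names ts, value and payload are hypothetical. A range restriction on the clustering column becomes a slice of the sorted rows, and column pruning keeps only the requested fields.

```python
from bisect import bisect_left, bisect_right

# One partition, rows kept sorted by a clustering column ("ts").
# Column names are hypothetical; "payload" stands in for a wide column
# that the query does not need.
partition = [
    {"ts": ts, "value": ts * 1.5, "payload": "x" * 100}
    for ts in range(1000)
]
clustering_keys = [row["ts"] for row in partition]


def clustering_range(low, high, columns):
    """Return rows with low <= ts <= high, keeping only the requested columns."""
    # Binary search finds the slice boundaries without visiting other rows.
    start = bisect_left(clustering_keys, low)
    stop = bisect_right(clustering_keys, high)
    # Column pruning: copy only the fields the caller asked for.
    return [{c: row[c] for c in columns} for row in partition[start:stop]]


result = clustering_range(100, 109, ["ts", "value"])
print(len(result), sorted(result[0]))
```

Only the rows inside the range are visited, and the wide payload column is never copied into the result.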

On Wed, Nov 4, 2020 at 11:12 AM Amit Sharma wrote:

> Hi, I have a question. When reading from Cassandra, should we use only the
> partition key in the where clause for performance, or does it not matter
> from Spark's perspective because it always allows filtering?
>
>
> Thanks
> Amit
>


Spark reading from cassandra

2020-11-04 Thread Amit Sharma
Hi, I have a question. When reading from Cassandra, should we use only the
partition key in the where clause for performance, or does it not matter
from Spark's perspective because it always allows filtering?


Thanks
Amit