Re: Cassandra querying time stamps

sujeet jog Tue, 20 Jun 2017 04:58:11 -0700

, Below is the query, looks like from physical plan, the query is same as
that of cqlsh,


 val query = s"""(select * from model_data
        where TimeStamp > \'$timeStamp+0000\' and TimeStamp <=
\'$startTS+0000\'
        and MetricID = $metricID)"""

    println("Model query" + query)

    val df = spark.read
      .format(Config.dbDriver)
      .options(Map("Keyspace" -> Config.dbName, "table" ->
"ml_forecast_tbl"))
      .load

    val d = spark.sparkContext.cassandraTable("analytics_db",
"ml_forecast_tbl")
     .where(" \"TimeStamp\" > ? and \"TimeStamp\" <= ? and \"MetricID\" =
1",
      timeStamp + "+0000", startTS + "+0000")


   d.createorReplaceTempView("mode_data")
   val modelDF = spark.sql(query).cache.
   println(spark.sql(query).queryExecution)



== Physical Plan ==
InMemoryTableScan [MetricID#9045, TimeStamp#9046, ResourceID#9047,
Forecast#9048, GlobalThresholdMax#9049, GlobalThresholdMin#9050, Hi85#9051,
Hi99#9052, Low85#9053, Low99#9054]
:  +- InMemoryRelation [MetricID#9045, TimeStamp#9046, ResourceID#9047,
Forecast#9048, GlobalThresholdMax#9049, GlobalThresholdMin#9050, Hi85#9051,
Hi99#9052, Low85#9053, Low99#9054], true, 10000, StorageLevel(disk, memory,
deserialized, 1 replicas)
:     :  +- *Filter ((((cast(TimeStamp#9046 as string) > 2016-01-22
00:00:00+0000) && (cast(TimeStamp#9046 as string) <= 2016-01-22
00:30:00+0000)) && isnotnull(TimeStamp#9046)) && isnotnull(MetricID#9045))
:     :     +- *Scan
org.apache.spark.sql.cassandra.CassandraSourceRelation@40dc2ade
[MetricID#9045,TimeStamp#9046,ResourceID#9047,Forecast#9048,GlobalThresholdMax#9049,GlobalThresholdMin#9050,Hi85#9051,Hi99#9052,Low85#9053,Low99#9054]
PushedFilters: [IsNotNull(TimeStamp), IsNotNull(MetricID),
EqualTo(MetricID,1)], ReadSchema:
struct<MetricID:int,TimeStamp:timestamp,ResourceID:string,Forecast:double,GlobalThresholdMax:doub...


On Tue, Jun 20, 2017 at 5:13 PM, Riccardo Ferrari <[email protected]>
wrote:

> Hi,
>
> Personally I would inspect how dates are managed. How does your spark code
> looks like? What does the explain say. Does TimeStamp gets parsed the same
> way?
>
> Best,
>
> On Tue, Jun 20, 2017 at 12:52 PM, sujeet jog <[email protected]> wrote:
>
>> Hello,
>>
>> I have a table as below
>>
>> CREATE TABLE analytics_db.ml_forecast_tbl (
>>    "MetricID" int,
>>    "TimeStamp" timestamp,
>>    "ResourceID" timeuuid
>>    "Value"   double,
>>     PRIMARY KEY ("MetricID", "TimeStamp", "ResourceID")
>> )
>>
>> select * from ml_forecast_tbl where "MetricID" = 1 and "TimeStamp" >
>> '2016-01-22 00:00:00+0000' and "TimeStamp" <= '2016-01-22 00:30:00+0000' ;
>>
>>  MetricID | TimeStamp                       | ResourceID
>>           | Value|
>> ----------+---------------------------------+---------------
>> -----------------------+----------+
>> |        1 | 2016-01-22 00:30:00.000000+0000 |
>> 4a925190-3b13-11e7-83c6-a32261219d03 | 32.16177 |
>> | 23.74124 | 15.2371
>>         1 | 2016-01-22 00:30:00.000000+0000 |
>> 4a92c6c0-3b13-11e7-83c6-a32261219d03 | 32.16177 |
>> | 23.74124 | 15.2371
>>         1 | 2016-01-22 00:30:00.000000+0000 |
>> 4a936300-3b13-11e7-83c6-a32261219d03 | 32.16177 |
>> | 23.74124 | 15.2371
>>         1 | 2016-01-22 00:30:00.000000+0000 |
>> 4a93d830-3b13-11e7-83c6-a32261219d03 | 32.16177 |
>> | 23.74124 | 15.2371
>>
>> This query runs perfectly fine from cqlsh,   but not with Spark SQL, it
>> just emits empty results,
>> Is there a catch to think about on querying timestamp ranges with
>> cassandra spark connector
>>
>> Any inputs on this ?..
>>
>>
>> Thanks,
>> Sujeet
>>
>
>

Re: Cassandra querying time stamps

Reply via email to