Re: Cassandra querying time stamps

sujeet jog Tue, 20 Jun 2017 06:47:52 -0700

Correction.

On Tue, Jun 20, 2017 at 5:27 PM, sujeet jog <[email protected]> wrote:


> , Below is the query, looks like from physical plan, the query is same as
> that of cqlsh,
>
>  val query = s"""(select * from model_data
>         where TimeStamp > \'$timeStamp+0000\' and TimeStamp <=
> \'$startTS+0000\'
>         and MetricID = $metricID)"""
>
>     println("Model query" + query)
>
>     val df = spark.read
>       .format(Config.dbDriver)
>       .options(Map("Keyspace" -> Config.dbName, "table" ->
> "ml_forecast_tbl"))
>       .load
>
>
>    df.createorReplaceTempView("mode_data")
>    val modelDF = spark.sql(query).cache.
>    println(spark.sql(query).queryExecution)
>
>
>
> == Physical Plan ==
> InMemoryTableScan [MetricID#9045, TimeStamp#9046, ResourceID#9047,
> Forecast#9048, GlobalThresholdMax#9049, GlobalThresholdMin#9050, Hi85#9051,
> Hi99#9052, Low85#9053, Low99#9054]
> :  +- InMemoryRelation [MetricID#9045, TimeStamp#9046, ResourceID#9047,
> Forecast#9048, GlobalThresholdMax#9049, GlobalThresholdMin#9050, Hi85#9051,
> Hi99#9052, Low85#9053, Low99#9054], true, 10000, StorageLevel(disk, memory,
> deserialized, 1 replicas)
> :     :  +- *Filter ((((cast(TimeStamp#9046 as string) > 2016-01-22
> 00:00:00+0000) && (cast(TimeStamp#9046 as string) <= 2016-01-22
> 00:30:00+0000)) && isnotnull(TimeStamp#9046)) && isnotnull(MetricID#9045))
> :     :     +- *Scan org.apache.spark.sql.cassandra.
> CassandraSourceRelation@40dc2ade [MetricID#9045,TimeStamp#9046,
> ResourceID#9047,Forecast#9048,GlobalThresholdMax#9049,
> GlobalThresholdMin#9050,Hi85#9051,Hi99#9052,Low85#9053,Low99#9054]
> PushedFilters: [IsNotNull(TimeStamp), IsNotNull(MetricID),
> EqualTo(MetricID,1)], ReadSchema: struct<MetricID:int,TimeStamp:
> timestamp,ResourceID:string,Forecast:double,GlobalThresholdMax:doub...
>
>
> On Tue, Jun 20, 2017 at 5:13 PM, Riccardo Ferrari <[email protected]>
> wrote:
>
>> Hi,
>>
>> Personally I would inspect how dates are managed. How does your spark
>> code looks like? What does the explain say. Does TimeStamp gets parsed the
>> same way?
>>
>> Best,
>>
>> On Tue, Jun 20, 2017 at 12:52 PM, sujeet jog <[email protected]>
>> wrote:
>>
>>> Hello,
>>>
>>> I have a table as below
>>>
>>> CREATE TABLE analytics_db.ml_forecast_tbl (
>>>    "MetricID" int,
>>>    "TimeStamp" timestamp,
>>>    "ResourceID" timeuuid
>>>    "Value"   double,
>>>     PRIMARY KEY ("MetricID", "TimeStamp", "ResourceID")
>>> )
>>>
>>> select * from ml_forecast_tbl where "MetricID" = 1 and "TimeStamp" >
>>> '2016-01-22 00:00:00+0000' and "TimeStamp" <= '2016-01-22 00:30:00+0000' ;
>>>
>>>  MetricID | TimeStamp                       | ResourceID
>>>           | Value|
>>> ----------+---------------------------------+---------------
>>> -----------------------+----------+
>>> |        1 | 2016-01-22 00:30:00.000000+0000 |
>>> 4a925190-3b13-11e7-83c6-a32261219d03 | 32.16177 |
>>> | 23.74124 | 15.2371
>>>         1 | 2016-01-22 00:30:00.000000+0000 |
>>> 4a92c6c0-3b13-11e7-83c6-a32261219d03 | 32.16177 |
>>> | 23.74124 | 15.2371
>>>         1 | 2016-01-22 00:30:00.000000+0000 |
>>> 4a936300-3b13-11e7-83c6-a32261219d03 | 32.16177 |
>>> | 23.74124 | 15.2371
>>>         1 | 2016-01-22 00:30:00.000000+0000 |
>>> 4a93d830-3b13-11e7-83c6-a32261219d03 | 32.16177 |
>>> | 23.74124 | 15.2371
>>>
>>> This query runs perfectly fine from cqlsh,   but not with Spark SQL, it
>>> just emits empty results,
>>> Is there a catch to think about on querying timestamp ranges with
>>> cassandra spark connector
>>>
>>> Any inputs on this ?..
>>>
>>>
>>> Thanks,
>>> Sujeet
>>>
>>
>>
>

Re: Cassandra querying time stamps

Reply via email to