Correction.
On Tue, Jun 20, 2017 at 5:27 PM, sujeet jog <[email protected]> wrote:
> , Below is the query, looks like from physical plan, the query is same as
> that of cqlsh,
>
> val query = s"""(select * from model_data
> where TimeStamp > \'$timeStamp+0000\' and TimeStamp <=
> \'$startTS+0000\'
> and MetricID = $metricID)"""
>
> println("Model query" + query)
>
> val df = spark.read
> .format(Config.dbDriver)
> .options(Map("Keyspace" -> Config.dbName, "table" ->
> "ml_forecast_tbl"))
> .load
>
>
> df.createorReplaceTempView("mode_data")
> val modelDF = spark.sql(query).cache.
> println(spark.sql(query).queryExecution)
>
>
>
> == Physical Plan ==
> InMemoryTableScan [MetricID#9045, TimeStamp#9046, ResourceID#9047,
> Forecast#9048, GlobalThresholdMax#9049, GlobalThresholdMin#9050, Hi85#9051,
> Hi99#9052, Low85#9053, Low99#9054]
> : +- InMemoryRelation [MetricID#9045, TimeStamp#9046, ResourceID#9047,
> Forecast#9048, GlobalThresholdMax#9049, GlobalThresholdMin#9050, Hi85#9051,
> Hi99#9052, Low85#9053, Low99#9054], true, 10000, StorageLevel(disk, memory,
> deserialized, 1 replicas)
> : : +- *Filter ((((cast(TimeStamp#9046 as string) > 2016-01-22
> 00:00:00+0000) && (cast(TimeStamp#9046 as string) <= 2016-01-22
> 00:30:00+0000)) && isnotnull(TimeStamp#9046)) && isnotnull(MetricID#9045))
> : : +- *Scan org.apache.spark.sql.cassandra.
> CassandraSourceRelation@40dc2ade [MetricID#9045,TimeStamp#9046,
> ResourceID#9047,Forecast#9048,GlobalThresholdMax#9049,
> GlobalThresholdMin#9050,Hi85#9051,Hi99#9052,Low85#9053,Low99#9054]
> PushedFilters: [IsNotNull(TimeStamp), IsNotNull(MetricID),
> EqualTo(MetricID,1)], ReadSchema: struct<MetricID:int,TimeStamp:
> timestamp,ResourceID:string,Forecast:double,GlobalThresholdMax:doub...
>
>
> On Tue, Jun 20, 2017 at 5:13 PM, Riccardo Ferrari <[email protected]>
> wrote:
>
>> Hi,
>>
>> Personally I would inspect how dates are managed. How does your spark
>> code looks like? What does the explain say. Does TimeStamp gets parsed the
>> same way?
>>
>> Best,
>>
>> On Tue, Jun 20, 2017 at 12:52 PM, sujeet jog <[email protected]>
>> wrote:
>>
>>> Hello,
>>>
>>> I have a table as below
>>>
>>> CREATE TABLE analytics_db.ml_forecast_tbl (
>>> "MetricID" int,
>>> "TimeStamp" timestamp,
>>> "ResourceID" timeuuid
>>> "Value" double,
>>> PRIMARY KEY ("MetricID", "TimeStamp", "ResourceID")
>>> )
>>>
>>> select * from ml_forecast_tbl where "MetricID" = 1 and "TimeStamp" >
>>> '2016-01-22 00:00:00+0000' and "TimeStamp" <= '2016-01-22 00:30:00+0000' ;
>>>
>>> MetricID | TimeStamp | ResourceID
>>> | Value|
>>> ----------+---------------------------------+---------------
>>> -----------------------+----------+
>>> | 1 | 2016-01-22 00:30:00.000000+0000 |
>>> 4a925190-3b13-11e7-83c6-a32261219d03 | 32.16177 |
>>> | 23.74124 | 15.2371
>>> 1 | 2016-01-22 00:30:00.000000+0000 |
>>> 4a92c6c0-3b13-11e7-83c6-a32261219d03 | 32.16177 |
>>> | 23.74124 | 15.2371
>>> 1 | 2016-01-22 00:30:00.000000+0000 |
>>> 4a936300-3b13-11e7-83c6-a32261219d03 | 32.16177 |
>>> | 23.74124 | 15.2371
>>> 1 | 2016-01-22 00:30:00.000000+0000 |
>>> 4a93d830-3b13-11e7-83c6-a32261219d03 | 32.16177 |
>>> | 23.74124 | 15.2371
>>>
>>> This query runs perfectly fine from cqlsh, but not with Spark SQL, it
>>> just emits empty results,
>>> Is there a catch to think about on querying timestamp ranges with
>>> cassandra spark connector
>>>
>>> Any inputs on this ?..
>>>
>>>
>>> Thanks,
>>> Sujeet
>>>
>>
>>
>