Spark structured streaming time series forecasting

2018-01-08 Thread Bogdan Cojocar
Hello,

Is there a way to do time series forecasting in Spark Structured
Streaming? Is any integration with spark-ts or a similar library in
progress?

Many thanks,
Bogdan Cojocar


Different behaviour when querying a spark DataFrame from dynamodb

2017-12-13 Thread Bogdan Cojocar
I am reading data from a DynamoDB table into a DataFrame:

import spark.implicits._  // enables the $"..." column syntax outside spark-shell

val data = spark.read.dynamodb("table")
data.filter($"field1".like("%hello%")).createOrReplaceTempView("temp")
spark.sql("select * from temp").show()

The last statement returns results. However, if I run:

spark.sql("select field2 from temp").show()

I get no results. The DataFrame has the following schema:

root
 |-- field1: string (nullable = true)
 |-- field2: string (nullable = true)
 |-- field3: string (nullable = true)
 |-- field4: long (nullable = true)
 |-- field5: string (nullable = true)

Dependencies:

spark 2.2.0
scala 2.11.8
spark-dynamodb 0.0.11

Spark running on local[*]
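
One way to narrow this down is to run the same two queries against a
plain in-memory DataFrame with the same column names. The following is
only a sketch: the object name ProjectionCheck, the helper names, and the
row values are made up for illustration, and it says nothing definite
about how spark-dynamodb behaves.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ProjectionCheck {
  // In-memory stand-in for the DynamoDB table. Column names follow the
  // schema above; the row values are hypothetical.
  def sampleData(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(
      ("hello world", "a", "x", 1L, "p"),
      ("goodbye", "b", "y", 2L, "q")
    ).toDF("field1", "field2", "field3", "field4", "field5")
  }

  // Registers the filtered view, then returns how many rows each of the
  // two queries produces: (select *, select field2).
  def rowCounts(spark: SparkSession, data: DataFrame): (Int, Int) = {
    import spark.implicits._
    data.filter($"field1".like("%hello%")).createOrReplaceTempView("temp")
    val all = spark.sql("select * from temp")
    val projected = spark.sql("select field2 from temp")
    // Print the plans so the scan and filter steps can be compared
    // with the plan of the DynamoDB-backed query.
    projected.explain(true)
    (all.collect().length, projected.collect().length)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("projection-check")
      .getOrCreate()
    val (allRows, projectedRows) = rowCounts(spark, sampleData(spark))
    println(s"select *: $allRows rows, select field2: $projectedRows rows")
    spark.stop()
  }
}
```

With the in-memory data both queries return one row. If the same
rowCounts call on the DynamoDB-backed DataFrame returns rows for
"select *" but none for "select field2", the difference may lie in how
the connector handles a filter on a column that is not part of the
projection; comparing the two explain(true) outputs is a reasonable
next step.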


Spark Structured Streaming how to read data from AWS SQS

2017-12-11 Thread Bogdan Cojocar
For Spark Streaming there are connectors
<https://github.com/imapi/spark-sqs-receiver> that can read from SQS.

Unfortunately, I couldn't find any for Spark Structured Streaming,
presumably because it is a newer technology. Is there a way to use a Spark
Streaming connector as a Structured Streaming source? Or is there a way to
write a custom source, similar to the custom receivers
<http://spark.apache.org/docs/latest/streaming-custom-receivers.html>
that can be written for a Spark Streaming application?


Many thanks,

Bogdan Cojocar