How to use Spark Streaming from an HTTP api?

2014-08-18 Thread bumble123
I want to send an HTTP request (specifically to OpenTSDB) to get data. I've been looking at the StreamingContext API and don't see any method that can connect to an HTTP endpoint. Has anyone connected Spark Streaming to a server via HTTP requests before? How did you do it?
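For readers following this thread, here is a minimal Python sketch of the HTTP side of the question only. It assumes OpenTSDB 2.x's `/api/query` endpoint and its JSON response shape (a list of series, each with a `dps` map of timestamp to value). The host, port, and metric name in the usage note are placeholders, and the function names are hypothetical. How the result is fed into Spark Streaming (for example through a custom receiver) is not covered.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, metric, start="1h-ago", aggregator="sum"):
    """Build an OpenTSDB /api/query URL for a single metric (assumed 2.x API)."""
    params = urlencode({"start": start, "m": f"{aggregator}:{metric}"})
    return f"{base_url.rstrip('/')}/api/query?{params}"


def parse_datapoints(payload):
    """Flatten an /api/query JSON response into sorted (timestamp, value) pairs."""
    points = []
    for series in json.loads(payload):
        # "dps" maps epoch-second strings to numeric values.
        for ts, value in series["dps"].items():
            points.append((int(ts), float(value)))
    return sorted(points)


def fetch_datapoints(base_url, metric, start="1h-ago"):
    """Issue the request; this needs a reachable OpenTSDB server."""
    with urlopen(build_query_url(base_url, metric, start)) as response:
        return parse_datapoints(response.read().decode("utf-8"))
```

A call such as `fetch_datapoints("http://localhost:4242", "sys.cpu.user")` would return the last hour of points, assuming a server is listening there. Polling this on a timer gives periodic batches rather than a continuous stream.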

Getting percentile from Spark Streaming?

2014-08-13 Thread bumble123
Hi, I'm trying to figure out how to constantly update, say, the 95th percentile of a set of data through Spark Streaming. I'm not sure how to order the dataset though, and while I can find percentiles in regular Spark, I can't seem to figure out how to get that to transfer over to Spark
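As an illustration of the ordering step being asked about, the following plain-Python sketch computes a percentile with the nearest-rank method: sort the values, then take the element at rank ceil(p * n / 100). It is not Spark code, and the function name is hypothetical. In a streaming job the same logic would apply to the values of one batch or window; a percentile over everything seen so far would need either the full history or an approximate summary.

```python
import math


def percentile_nearest_rank(values, p):
    """Return the p-th percentile of values using the nearest-rank method."""
    if not values:
        raise ValueError("need at least one value")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(values)
    # 1-based rank of the smallest value with at least p% of the data at or below it.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]
```

For example, `percentile_nearest_rank(batch, 95)` picks the 95th percentile of one batch of numbers.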

Re: Use SparkStreaming to find the max of a dataset?

2014-08-08 Thread bumble123
Do you know how I might do a percentile then? I can't figure out how to order my data and count it so that I can calculate the percentile.

Re: Use SparkStreaming to find the max of a dataset?

2014-08-08 Thread bumble123
Also, I tried that code and I keep getting this error: <console>:26: error: overloaded method value max with alternatives: (x$1: Double,x$2: Double)Double <and> (x$1: Float,x$2: Float)Float <and> (x$1: Long,x$2: Long)Long <and> (x$1: Int,x$2: Int)Int cannot be applied to (String, Int.type)

Re: Use SparkStreaming to find the max of a dataset?

2014-08-08 Thread bumble123
Just realized that my DStream was being read in as a stream of Strings. I'm trying to use socketTextStream but the .toInt method doesn't seem to be working. Is there another way to get a numerical stream from a socket?

Re: Use SparkStreaming to find the max of a dataset?

2014-08-08 Thread bumble123
Figured it out! Just mapped it to a .toInt version of itself. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Use-SparkStreaming-to-find-the-max-of-a-dataset-tp11734p11812.html Sent from the Apache Spark User List mailing list archive at Nabble.com.
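The fix described here, converting each text line to a number before comparing, can be sketched in plain Python as follows. This is an illustration of the parsing step only, not the poster's Scala code; the function name is hypothetical, and skipping bad lines instead of failing is an assumption about the desired behaviour.

```python
def parse_ints(lines):
    """Convert raw text lines to ints, skipping blank or non-numeric lines."""
    numbers = []
    for line in lines:
        try:
            # strip() removes the trailing newline a socket line usually carries.
            numbers.append(int(line.strip()))
        except ValueError:
            continue  # not a valid integer; ignore this line
    return numbers
```

Once the elements are numbers instead of strings, a numeric `max` over them behaves as expected.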

Use SparkStreaming to find the max of a dataset?

2014-08-07 Thread bumble123
I can't figure out how to use Spark Streaming to find the max of a 5-second batch of data and keep updating the max every 5 seconds. How would I do this?
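To make the two quantities in this question concrete, here is a plain-Python simulation, not Spark code: each inner list stands in for one 5-second batch, and the generator reports both that batch's max and the max seen so far. The function name is hypothetical. In Spark Streaming, the per-batch value would roughly correspond to a `reduce` with a max function on the DStream, while the running value needs state carried between batches.

```python
def batch_maxima(batches):
    """Yield (batch_max, running_max) for each non-empty batch, in order."""
    running = None
    for batch in batches:
        if not batch:
            continue  # an empty batch has no max; keep the previous running max
        current = max(batch)
        running = current if running is None else max(running, current)
        yield current, running
```

Iterating over `batch_maxima(stream_of_batches)` gives one pair per batch interval that contained data.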

Re: How to read from OpenTSDB using PySpark (or Scala Spark)?

2014-08-05 Thread bumble123
Thank you!! Could you give me any sample code for the receiver? I'm still new to Spark and not quite sure how I would do that.

How to read from OpenTSDB using PySpark (or Scala Spark)?

2014-08-01 Thread bumble123
Hi, I've seen many threads about reading from HBase into Spark, but none about how to read from OpenTSDB into Spark. Does anyone know anything about this? I tried looking into it, but I think OpenTSDB saves its information into HBase using hex and I'm not sure how to interpret the data. If you
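On the "hex" point, a rough sketch may help. It assumes OpenTSDB's default data-table row key layout without salting: a 3-byte metric UID, a 4-byte base timestamp in epoch seconds, then 3-byte tag-key and tag-value UID pairs. Those widths are assumptions about a default install, and the function name is hypothetical. Mapping the UIDs back to metric and tag names requires a lookup in OpenTSDB's UID table, which this sketch does not attempt.

```python
def decode_row_key(hex_key, uid_width=3, ts_width=4):
    """Split a hex-encoded OpenTSDB row key into metric UID, base time, and tag UIDs."""
    raw = bytes.fromhex(hex_key)
    header = uid_width + ts_width
    tag_pair = 2 * uid_width
    if len(raw) < header or (len(raw) - header) % tag_pair:
        raise ValueError("unexpected row key length")
    tags = []
    for i in range(header, len(raw), tag_pair):
        tag_key_uid = raw[i:i + uid_width].hex()
        tag_value_uid = raw[i + uid_width:i + tag_pair].hex()
        tags.append((tag_key_uid, tag_value_uid))
    return {
        "metric_uid": raw[:uid_width].hex(),
        # Base timestamp is a big-endian unsigned integer (epoch seconds).
        "base_time": int.from_bytes(raw[uid_width:header], "big"),
        "tags": tags,
    }
```

The HTTP API is usually the simpler route, since it returns metric names and values already decoded.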

Re: How to read from OpenTSDB using PySpark (or Scala Spark)?

2014-08-01 Thread bumble123
I'm trying to get metrics out of TSDB so I can use Spark to do anomaly detection on graphs.

Re: How to read from OpenTSDB using PySpark (or Scala Spark)?

2014-08-01 Thread bumble123
So is there no way to do this through Spark Streaming? Won't I have to do batch processing if I use the HTTP API rather than getting it directly into Spark?

Spark job finishes then command shell is blocked/hangs?

2014-07-31 Thread bumble123
Hi, my Spark job finishes with this output: 14/07/31 16:33:25 INFO SparkContext: Job finished: count at RetrieveData.scala:18, took 0.013189 s However, the command line doesn't return to the prompt and instead just hangs. This is my first time running a Spark job - is this normal? If not, how do I