Fwd: [pyspark][MLlib] Getting WARN FPGrowth: Input data is not cached for cached data

2017-12-21 Thread Anu B Nair
Hi, the following is my PySpark code (the input sample_fpgrowth.txt and the Python code are attached to this mail). Even after calling cache, I am getting the warning: Input data is not cached. *from pyspark.mllib.fpm import FPGrowth import pyspark from pyspark.context import

Re: Can Spark shuffle leverage Alluxio to obtain higher stability?

2017-12-21 Thread vincent gromakowski
If it is not resilient at the Spark level, can't you just relaunch your job with your orchestration tool? On 21 Dec 2017 at 09:34, "Georg Heiler" wrote: > Did you try to use the YARN shuffle service? > chopinxb wrote on Thu, 21 Dec 2017 at 04:43: > >>

Re: Spark Streaming to REST API

2017-12-21 Thread ashish rawat
Sorry for not making it explicit. We are using Spark Streaming as the streaming solution, and I was wondering whether it is a common pattern to do per-tuple Redis reads/writes and write to a REST API through Spark Streaming. Regards, Ashish On Fri, Dec 22, 2017 at 4:00 AM, Gourav Sengupta

Re: Spark Streaming to REST API

2017-12-21 Thread Gourav Sengupta
Hi Ashish, I was just wondering if there is any particular reason why you are posting this to a Spark group? Regards, Gourav On Thu, Dec 21, 2017 at 8:32 PM, ashish rawat wrote: > Hi, > > We are working on a streaming solution where multiple out-of-order streams > are

Spark Streaming to REST API

2017-12-21 Thread ashish rawat
Hi, We are working on a streaming solution where multiple out-of-order streams flow into the system and we need to join the streams based on a unique id. We are planning to use Redis for this: for every tuple, we will look up whether the id exists; we join if it does, or else put the tuple
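The per-tuple lookup-then-join idea in this message can be sketched in plain Python. This is only an illustration under stated assumptions: a dict stands in for Redis, the names join_or_store, Record and Pending are hypothetical, and because the message is cut off after "put the tuple", the sketch assumes the unmatched tuple is simply stored until its partner arrives.

```python
from typing import Any, Dict, Optional, Tuple

Record = Dict[str, Any]
# (name of the source stream, record waiting for its match); hypothetical layout
Pending = Tuple[str, Record]


def join_or_store(
    store: Dict[str, Pending], stream: str, record: Record
) -> Optional[Tuple[Record, Record]]:
    """Join `record` with a stored record of the same id, or park it in `store`.

    `store` is a plain dict standing in for Redis (an assumption of this sketch).
    """
    key = str(record["id"])
    pending = store.get(key)
    if pending is not None and pending[0] != stream:
        # The other stream already delivered this id: emit the joined pair.
        del store[key]
        return pending[1], record
    # First arrival for this id (or a repeat from the same stream): park it.
    store[key] = (stream, record)
    return None


if __name__ == "__main__":
    store: Dict[str, Pending] = {}
    events = [
        ("orders", {"id": 1, "amount": 30}),
        ("payments", {"id": 2, "status": "ok"}),
        ("payments", {"id": 1, "status": "ok"}),  # arrives out of order
    ]
    for stream, record in events:
        joined = join_or_store(store, stream, record)
        if joined is not None:
            print(joined)
```

With an actual Redis instance shared by several executors, the lookup and the write would have to happen atomically, otherwise two matching tuples arriving at the same time could both be parked and never joined.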

Re: Anyone know where to find independent contractors in New York?

2017-12-21 Thread x x
Try https://www.fiverr.com/ > On Dec 21, 2017, at 12:31 PM, Stephen Boesch wrote: > > Hi Richard, this is not a jobs board: please only discuss spark application > development issues. > > 2017-12-21 8:34 GMT-08:00 Richard L. Burton III

Re: Anyone know where to find independent contractors in New York?

2017-12-21 Thread Stephen Boesch
Hi Richard, this is not a jobs board: please only discuss spark application development issues. 2017-12-21 8:34 GMT-08:00 Richard L. Burton III : > I'm trying to locate four independent contractors who have experience with > Spark. I'm not sure where I can go to find

Anyone know where to find independent contractors in New York?

2017-12-21 Thread Richard L. Burton III
I'm trying to locate four independent contractors who have experience with Spark. I'm not sure where I can go to find experienced Spark consultants. Please, no recruiters. -- -Richard L. Burton III

Re: Reading data from OpenTSDB or KairosDB

2017-12-21 Thread Jörn Franke
There are data sources for Cassandra and HBase; however, I am not sure how useful they are, because you would then also need to implement the logic of OpenTSDB or KairosDB. It is better to implement your own data sources. Then, there are several projects enabling time-series queries in Spark, but I am not

Reading data from OpenTSDB or KairosDB

2017-12-21 Thread marko
Hello everyone, I would like to know whether there is a way to read time-series data from OpenTSDB (built on top of HBase) or KairosDB (built on top of Cassandra) into a Spark DataFrame (or RDDs)? Best regards, Marko

Re: Can Spark shuffle leverage Alluxio to obtain higher stability?

2017-12-21 Thread Georg Heiler
Did you try to use the YARN shuffle service? chopinxb wrote on Thu, 21 Dec 2017 at 04:43: > In my experience with Spark applications (mostly Spark SQL), when there is a > complete node failure in my cluster, jobs which have shuffle blocks on the > node will completely fail