
Re: Is there a way to tell if a receiver is a Reliable Receiver?

2017-04-17 Thread Charles O. Bajomo
The easiest way I found was to take a look at the source: any receiver that calls the version of store() that takes an iterator is considered reliable. A definitive list would be nice. Kind Regards
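The contract behind that store(Iterator) variant can be sketched without Spark: buffer messages, hand the whole batch to the store call in one shot, and only ack the source after the call returns. This is a plain-Java illustration of the pattern, not Spark's Receiver API; all class and method names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Sketch of the reliable-receiver contract: store the whole batch via one
// store(Iterator) call, then ack the source only after that call returns.
// MockQueue-style fields stand in for Spark's block store and JMS acks.
public class ReliableSketch {
    final List<String> stored = new ArrayList<>();   // stands in for Spark's block store
    final List<String> acked  = new ArrayList<>();   // stands in for JMS acks

    // The "reliable" store variant: accepts a whole batch at once.
    void store(Iterator<String> batch) {
        while (batch.hasNext()) stored.add(batch.next());
    }

    // Only ack once store(...) has returned successfully; if it throws,
    // nothing is acked and the source will redeliver.
    void receiveBatch(List<String> batch) {
        store(batch.iterator());
        for (String msg : batch) acked.add(msg);
    }

    public static void main(String[] args) {
        ReliableSketch r = new ReliableSketch();
        r.receiveBatch(List.of("m1", "m2"));
        System.out.println(r.stored.size() + " stored, " + r.acked.size() + " acked");
        // prints: 2 stored, 2 acked
    }
}
```

The unreliable variant, by contrast, stores records one at a time with no way to tie an ack to a durably stored batch.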

[Spark Streaming] Streaming job failing consistently after 1h

2017-03-05 Thread Charles O. Bajomo
Hello all, I have a strange behaviour I can't understand. I have a streaming job using a custom Java receiver that pulls data from a JMS queue, which I process and then write to HDFS as Parquet and Avro files. For some reason my job keeps failing after 1 hour and 30 minutes. When it fails I get an

[Spark] Accumulators or count()

2017-03-01 Thread Charles O. Bajomo
Hello everyone, I wanted to know if there is any benefit to using an accumulator over just executing a count() on the whole RDD. There seem to be a lot of issues with accumulators during a stage failure, and there also seems to be an issue rebuilding them if the application restarts from a
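The stage-failure issue alluded to above can be simulated in plain Java (this is an illustration of the semantics, not Spark API): when a task fails partway and is retried, increments the failed attempt already applied to an accumulator survive, so the retried records are counted twice, whereas count() recomputes from the data itself and stays exact. The class and method names below are hypothetical.

```java
import java.util.List;

// Illustrative simulation of why accumulator totals can drift under task
// retries while count() stays exact: the increments from the failed first
// attempt are kept, so retried records are counted twice.
public class AccumulatorRetrySketch {
    // Simulate one attempt that got through `failAfter` records before
    // failing, followed by a full successful retry of the partition.
    static long accumulatorAfterRetry(List<Integer> partition, int failAfter) {
        long accumulator = 0;
        for (int i = 0; i < failAfter; i++) accumulator++;   // failed first attempt
        for (int ignored : partition) accumulator++;         // successful retry
        return accumulator;
    }

    public static void main(String[] args) {
        List<Integer> partition = List.of(1, 2, 3, 4);
        long acc = accumulatorAfterRetry(partition, 2);
        long counted = partition.size();   // count() is idempotent under retry
        System.out.println("accumulator=" + acc + " count=" + counted);
        // prints: accumulator=6 count=4
    }
}
```

This matches the general caveat that accumulator updates performed inside transformations may be applied more than once when tasks are re-executed, which is one reason a count() over the RDD is the safer choice when exactness matters.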

Re: Reply: spark append files to the same hdfs dir issue for LeaseExpiredException

2017-02-28 Thread Charles O. Bajomo
Unless this is a managed Hive table, I would expect you can just run MSCK REPAIR TABLE to pick up the new partition. Of course, you will need to change the schema to reflect the new partition. Kind Regards

Re: spark append files to the same hdfs dir issue for LeaseExpiredException

2017-02-28 Thread Charles O. Bajomo
I see this problem as well with the _temporary directory, but from what I have been able to gather there is no way around it in that situation apart from making sure all reducers write to different folders. In the past I partitioned by executor ID; I don't know if this is the best way, though.
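The workaround described above boils down to deriving a distinct output directory per writer, so no two tasks ever contend for the same _temporary path. A minimal sketch, assuming a layout keyed by executor ID and batch time (the path scheme and `buildWriterPath` name are illustrative, not Spark API):

```java
// Sketch of the per-writer output-path workaround: each writer gets its own
// directory, keyed here by executor ID and batch timestamp, so concurrent
// writers never share a _temporary directory and cannot trip the
// LeaseExpiredException on each other's files.
public class WriterPathSketch {
    static String buildWriterPath(String baseDir, String executorId, long batchTime) {
        return String.format("%s/executor=%s/batch=%d", baseDir, executorId, batchTime);
    }

    public static void main(String[] args) {
        System.out.println(buildWriterPath("hdfs:///data/out", "exec-7", 1488240000L));
        // prints: hdfs:///data/out/executor=exec-7/batch=1488240000
    }
}
```

A downstream compaction or repair step can then merge the per-writer directories into the final layout once all writers have finished.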

[Spark Streaming] Batch versus streaming

2017-02-23 Thread Charles O. Bajomo
Hello, I am reading data from a JMS queue and I need to prevent any data loss, so I have a custom Java receiver that only acks messages once they have been stored. Sometimes my program crashes because I can't control the flow rate from the queue; it overwhelms the job and I end up losing
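One common way to keep a fast source from overwhelming a consumer is a bounded buffer: the receiving thread blocks (or the source's offer is rejected) once the buffer is full, instead of buffering unboundedly until the job dies. This is a generic backpressure sketch using the JDK's ArrayBlockingQueue, not anything from the Spark or JMS APIs; the capacity and names are illustrative.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of bounded-buffer backpressure: put() on a full ArrayBlockingQueue
// blocks until the consumer drains an element, so the producer's rate is
// throttled to the consumer's rather than growing memory without bound.
public class BackpressureSketch {
    public static void main(String[] args) {
        BlockingQueue<String> buffer = new ArrayBlockingQueue<>(2);

        buffer.offer("m1");
        buffer.offer("m2");
        // A blocking put("m3") here would wait for the consumer; offer() is
        // the non-blocking probe and reports rejection when the buffer is full.
        boolean accepted = buffer.offer("m3");

        System.out.println("accepted=" + accepted + " size=" + buffer.size());
        // prints: accepted=false size=2
    }
}
```

With a JMS source the same idea applies by only pulling (or acking) the next message once the local buffer has room, so unacked messages stay on the broker instead of in the receiver's heap.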

[Spark Streaming WAL] custom java streaming receiver and the WAL

2017-02-15 Thread Charles O. Bajomo
Hello all, I am having some problems with my custom Java-based receiver. I am running Spark 1.5.0 and I used the template on the Spark website (http://spark.apache.org/docs/1.0.0/streaming-custom-receivers.html). Basically my receiver listens to a JMS queue (Solace) and then, based on the size