Stream is corrupted in ShuffleBlockFetcherIterator

2019-08-15 Thread Mikhail Pryakhin
Hello, Spark community! I've been struggling with my job, which constantly fails because it cannot decompress some previously compressed blocks while shuffling data. I use Spark 2.2.0 with all configuration settings left at their defaults (no specific compression codec is specified). I've ascerta

Call Oracle Sequence using Spark

2019-08-15 Thread rajat kumar
Hi All, I have to call an Oracle sequence using Spark. Could you please tell me how to do that? Thanks, Rajat

Memory Limits error

2019-08-15 Thread Dennis Suhari
Hi community, I am using Spark on YARN. When submitting a job, after a long time I get an error message and a retry. It happens when I want to store the DataFrame to a table. spark_df.write.option("path", "/nlb_datalake/golden_zone/webhose/sentiment").saveAsTable("news_summary_test", mode="overwri

Spark streaming kafka source delay occasionally

2019-08-15 Thread ans
Using the Kafka consumer with 2-minute batches, tasks generally take 2 to 5 seconds to process, but some tasks take more than 40 seconds. I guess *CachedKafkaConsumer#poll* could be the problem. private def poll(timeout: Long): Unit = { val p = consumer.poll(timeout) val r = p.records(topicPartition)