How to add spark structured streaming kafka source receiver

2019-08-09 Thread zenglong chen
I have set "maxOffsetsPerTrigger", but it still receives one partition per trigger in micro-batch mode. Where do I configure reading from 10 partitions in parallel, like what Spark Streaming does?
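A sketch of a possible fix, assuming Spark 2.4+: "maxOffsetsPerTrigger" only caps the total number of offsets consumed per micro-batch; read parallelism is controlled separately. The Kafka source creates one task per topic partition by default, and the "minPartitions" option can raise that by splitting partitions into smaller offset ranges. Broker and topic names below are placeholders.

```scala
// Hedged sketch (Spark 2.4+, inside spark-shell or an app with a
// SparkSession named `spark`): request at least 10 input tasks.
val df = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "host1:9092") // placeholder broker
  .option("subscribe", "my_topic")                 // placeholder topic
  .option("maxOffsetsPerTrigger", "100000")        // caps offsets per batch only
  .option("minPartitions", "10")                   // splits reads across >= 10 tasks
  .load()
```

Note that "minPartitions" raises the task count even when the topic itself has fewer than 10 partitions, by dividing each partition's offset range.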

Using StreamingQueryListener.OnTerminate for Kafka Offset restore

2019-08-09 Thread Sandish Kumar HN
Hey Everyone, I'm using Spark StreamingQueryListener in a Structured Streaming app. Whenever I see OffsetOutOfRangeExceptions in the Spark job, inside the StreamingQueryListener.onTerminated method I update the offsets in the Spark checkpoint directory. I was able to parse all OffsetOutOfRangeExceptions
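For reference, a minimal sketch of the listener shape being described. The actual callback in the StreamingQueryListener API is onQueryTerminated (not OnTerminate/onTerminated), and its event exposes the failure message as an Option[String]. Rewriting checkpoint offset files by hand is not a supported API, so the recovery action below is left as an application-specific placeholder.

```scala
import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener._

// Hedged sketch: detect an OffsetOutOfRangeException on termination and
// hand off to custom recovery logic (e.g. archiving the checkpoint dir
// and resubmitting the query). Recovery details are application-specific.
class OffsetRecoveryListener extends StreamingQueryListener {
  override def onQueryStarted(event: QueryStartedEvent): Unit = ()
  override def onQueryProgress(event: QueryProgressEvent): Unit = ()
  override def onQueryTerminated(event: QueryTerminatedEvent): Unit = {
    event.exception.foreach { msg =>
      if (msg.contains("OffsetOutOfRangeException")) {
        // Placeholder: repair/clear checkpoint state, then restart the query.
        println(s"Query ${event.id} terminated with out-of-range offsets: $msg")
      }
    }
  }
}

spark.streams.addListener(new OffsetRecoveryListener)
```

An alternative worth considering is setting the Kafka source option "failOnDataLoss" to "false", which lets the query skip past offsets that have aged out instead of failing.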

Re: Dataset -- Schema for type scala.collection.Set[scala.Int] is not supported

2019-08-09 Thread Mohit Jaggi
Switched to immutable.Set and it works. This is weird, as the code in ScalaReflection.scala seems to support scala.collection.Set. cc: dev list, in case this is a bug. On Thu, Aug 8, 2019 at 8:41 PM Mohit Jaggi wrote: > Is this not supported? I found this diff >
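The workaround described above can be sketched as follows: pin the field type to scala.collection.immutable.Set, for which Spark can derive an encoder, instead of the generic scala.collection.Set trait, which fails with the "Schema for type ... is not supported" error. The case class and values are illustrative placeholders.

```scala
// Hedged sketch: declaring the concrete immutable.Set lets Spark's
// product encoder derive a schema (an ArrayType under the hood).
case class Record(id: Int, tags: scala.collection.immutable.Set[Int])

import spark.implicits._  // requires a SparkSession named `spark`

// Works: immutable.Set has a derivable encoder.
val ds = Seq(Record(1, Set(1, 2, 3))).toDS()
ds.printSchema()
```

Using `scala.collection.Set[Int]` as the field type instead is what triggers the reported UnsupportedOperationException at encoder-derivation time.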

Re: Unable to write data from Spark into a Hive Managed table

2019-08-09 Thread Mich Talebzadeh
Check your permissions. Can you do an INSERT ... SELECT from the external table into a Hive managed table created by Spark? // // Need to create and populate target ORC table transactioncodes_ll in database accounts in Hive // HiveContext.sql("use accounts") // // Drop and create table transactioncodes_ll
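The suggested test can be sketched roughly as below. The database (`accounts`) and target table (`transactioncodes_ll`) come from the thread; the column names and the external source table name are hypothetical placeholders.

```scala
// Hedged sketch of the INSERT ... SELECT permission test. Column names
// and the external table name (transactioncodes_ext) are assumptions.
spark.sql("USE accounts")
spark.sql("DROP TABLE IF EXISTS transactioncodes_ll")
spark.sql(
  """CREATE TABLE transactioncodes_ll (id INT, code STRING)
    |STORED AS ORC""".stripMargin)
spark.sql(
  """INSERT INTO transactioncodes_ll
    |SELECT id, code FROM transactioncodes_ext""".stripMargin)
```

If this fails with an authorization error, the problem is permissions; if it fails only for the managed table on HDP 3.x, the transactional-table restrictions discussed in this thread are the more likely cause.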

Unable to write data from Spark into a Hive Managed table

2019-08-09 Thread Debabrata Ghosh
Hi, I am using Hortonworks Data Platform 3.1. I am unable to write data from Spark into a Hive managed table, but am able to do so into a Hive external table. Would you please help me with a resolution. Thanks, Debu
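One likely explanation, stated as an assumption: on HDP 3.x, Hive managed tables are transactional (ACID) by default, and Spark cannot write to them through its built-in Hive support; Hortonworks documents the Hive Warehouse Connector (HWC) for this. A sketch of the documented write path, with a placeholder DataFrame and table name:

```scala
import com.hortonworks.hwc.HiveWarehouseSession

// Hedged sketch: write a DataFrame into an ACID managed table via HWC.
// Requires the HWC jar on the classpath and the HWC/LLAP configs set
// (spark.sql.hive.hiveserver2.jdbc.url etc.) per the HDP documentation.
val hive = HiveWarehouseSession.session(spark).build()

df.write
  .format(HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR)
  .option("table", "accounts.my_managed_table") // placeholder table
  .mode("append")
  .save()
```

Writes to external (non-transactional) tables keep working through plain `df.write.saveAsTable`, which matches the behavior reported above.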

Re: Spark SQL reads all leaf directories on a partitioned Hive table

2019-08-09 Thread Hao Ren
Hi Mich, Thank you for your reply. I need to be clearer about the environment. I am using spark-shell to run the query. Actually, the query works even without core-site and hdfs-site being under $SPARK_HOME/conf. My problem is efficiency, because all of the partitions were scanned instead of the
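For readers hitting the same symptom, a sketch of how partition pruning is usually verified, with placeholder table and partition-column names: filter directly on the partition column and inspect the physical plan. If the predicate appears under PartitionFilters, only the matching leaf directories should be listed.

```scala
// Hedged sketch: "events" and its partition column "dt" are placeholders.
val pruned = spark.sql(
  "SELECT * FROM mydb.events WHERE dt = '2019-08-09'")

// The physical plan should show the predicate under PartitionFilters,
// not PushedFilters, when pruning actually happens.
pruned.explain(true)
```

It can also be worth checking that `spark.sql.hive.metastorePartitionPruning` is `true` (the default in recent releases), so partition predicates are pushed to the metastore instead of listing every partition.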