Re: Non string type partitions

2023-04-15 Thread Charles vinodh
Bumping this up again for suggestions... Is the official recommendation to not have *int* or *date* typed partition columns? On Wed, 12 Apr 2023 at 10:44, Charles vinodh wrote: > There are other distributed execution engines (like Hive, Trino) that do > support non-string data

Re: Non string type partitions

2023-04-12 Thread Charles vinodh
ory cannot be an object, it has to be a > string to create partitioned dirs like "date=2023-04-10" > > On Tue, 11 Apr, 2023, 8:27 pm Charles vinodh, > wrote: > >> >> Hi Team, >> >> We are running into the below error when we are trying t

Non string type partitions

2023-04-11 Thread Charles vinodh
Hi Team, We are running into the below error when we try to run a simple query on a partitioned table in Spark. *MetaException(message:Filtering is supported only on partition keys of type string)* Our partition column has been set to type *date* instead of string, and the query is a very
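One workaround sometimes suggested for this MetaException (an assumption on my part, not something confirmed in this thread) is to disable metastore-side partition pruning so Spark lists all partitions and filters them client-side; `spark.sql.hive.metastorePartitionPruning` is a real Spark SQL conf, but this trades pruning in the metastore for a full partition listing, which can be slow on tables with many partitions:

```python
# Sketch of a possible workaround (assumes a Hive-backed table reachable
# from this session; the table and app names are placeholders).
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("date-partition-workaround")
         .config("spark.sql.hive.metastorePartitionPruning", "false")
         .enableHiveSupport()
         .getOrCreate())

# With pruning pushed back to Spark, a filter on the date-typed partition
# column is applied after listing partitions instead of being sent to the
# metastore, which is what raised the MetaException.
# spark.sql("SELECT * FROM my_table WHERE date = DATE'2023-04-10'")
```

The more durable fix, as the thread's subject hints, is to declare the partition column as `string` at table-creation time.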

Re: Convert each partition of RDD to Dataframe

2020-02-27 Thread Charles vinodh
Just split the single RDD into multiple individual RDDs using a filter operation, and then convert each individual RDD to its respective DataFrame. On Thu, Feb 27, 2020, 7:29 AM Manjunath Shetty H wrote: > > Hello All, > > In spark i am creating the custom partitions with Custom RDD, each >
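The filter-then-convert pattern described above can be illustrated outside Spark with plain Python; here `records` is just a list standing in for the RDD, and the dict of filtered lists stands in for the per-partition RDDs (in Spark itself you would call `rdd.filter(...)` per key and then `spark.createDataFrame(...)` on each result):

```python
# Plain-Python stand-in for splitting one collection into one collection
# per partition key via a filter predicate.
records = [
    {"part": "a", "value": 1},
    {"part": "b", "value": 2},
    {"part": "a", "value": 3},
]

# Collect the distinct partition keys, then filter once per key.
keys = {r["part"] for r in records}
split = {k: [r for r in records if r["part"] == k] for k in keys}

# split["a"] now holds only the "a" records; in Spark each such piece
# would be handed to spark.createDataFrame to get its own DataFrame.
```

Note that in Spark each `filter` triggers a separate pass over the source RDD, so this is simplest when the number of distinct keys is small.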

Re: Spark Kafka Streaming making progress but there is no data to be consumed

2019-09-11 Thread Charles vinodh
process (let's call them >> x and y) >> 2. If these offsets have fallen out of the retention period, Spark will >> try to set the offset to x which is less than z > y > x. >> 3. Since z > y, Spark will not process any of the data >> 4. Goto 1 >> >>
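The stuck-consumer loop quoted above can be sketched in plain Python (the offset values are hypothetical; `x` and `y` are the offsets Spark has stored for the batch, `z` is the earliest offset Kafka still retains after the retention period):

```python
# Illustration of why the batch stays empty when retention has passed
# the stored offsets, i.e. z > y > x.
x, y, z = 100, 200, 500

start = x  # Spark resets to its stored starting offset x
# Data is only readable from offset z upward, so a range ending at
# y < z contains nothing that is still retained.
available = y - start if start >= z else 0
# available == 0: the batch processes no records, offsets are not
# advanced past z, and the cycle repeats from step 1.
```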

Re: Spark Kafka Streaming making progress but there is no data to be consumed

2019-09-11 Thread Charles vinodh
if below option is not set. > > Set failOnDataLoss=true option to see failures. > > On Wed, Sep 11, 2019 at 3:24 PM Charles vinodh > wrote: > >> The only form of rate limiting I have set is *maxOffsetsPerTrigger *and >> *fetch.message.max.bytes. * >> >> *"
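The options discussed above are set directly on the Kafka source. A minimal sketch, assuming an existing SparkSession named `spark` and placeholder `bootstrap_servers`/`topic` values (`failOnDataLoss` and `maxOffsetsPerTrigger` are real options of the spark-sql-kafka source):

```python
# Sketch of a Kafka source with the options from this thread (requires a
# running Spark cluster and the spark-sql-kafka package; not runnable
# standalone).
bootstrap_servers = "broker1:9092"  # placeholder
topic = "my-topic"                  # placeholder

df = (spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", bootstrap_servers)
      .option("subscribe", topic)
      .option("failOnDataLoss", "true")        # fail loudly instead of silently skipping lost offsets
      .option("maxOffsetsPerTrigger", "100000")  # rate-limit each micro-batch
      .load())
```

With `failOnDataLoss=true`, an offset that has aged out of Kafka retention raises an error in the query rather than producing the silent empty-batch behavior described in this thread.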

Re: Spark Kafka Streaming making progress but there is no data to be consumed

2019-09-11 Thread Charles vinodh
, Sep 11, 2019 at 2:39 PM Charles vinodh > wrote: > >> >> Hi, >> >> I am trying to run a spark application ingesting data from Kafka using >> the Spark structured streaming and the spark library >> org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.1. I am

Spark Kafka Streaming making progress but there is no data to be consumed

2019-09-11 Thread Charles vinodh
Hi, I am trying to run a Spark application ingesting data from Kafka using Spark Structured Streaming and the library org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.1. I am facing a very weird issue where, during execution of all my micro-batches, the Kafka consumer is not able to fetch