Re: Filtering JSON records when there isn't an exact schema match in Spark

2023-07-03 Thread Vikas Kumar
Have you tried the DROPMALFORMED option? On Mon, Jul 3, 2023, 1:34 PM Shashank Rao wrote: > Update: Got it working by using the *_corrupt_record* field for the first > case (record 4) > > schema = schema.add("_corrupt_record", DataTypes.StringType); > Dataset ds =
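
For context, the two approaches mentioned in this snippet differ in what happens to bad records: Spark's JSON reader in DROPMALFORMED mode discards them, while the default PERMISSIVE mode keeps them and stores the raw text in the `_corrupt_record` field when the schema includes it (as in the quoted `schema.add` call). The sketch below only models that difference in plain Python; it is not Spark code. The sample lines and the `read_json_lines` helper are hypothetical, and only unparseable JSON is treated as malformed here.

```python
import json

# Plain-Python model of PERMISSIVE vs DROPMALFORMED handling (not Spark).
# read_json_lines and the sample data are hypothetical, for illustration only.

def read_json_lines(lines, fields, mode="PERMISSIVE"):
    """Parse JSON lines into rows restricted to `fields`."""
    rows = []
    for line in lines:
        try:
            record = json.loads(line)
            if not isinstance(record, dict):
                raise ValueError("not a JSON object")
        except ValueError:  # json.JSONDecodeError is a ValueError subclass
            if mode == "DROPMALFORMED":
                continue  # discard the bad record entirely
            # PERMISSIVE: null out the fields, keep the raw text
            row = {f: None for f in fields}
            row["_corrupt_record"] = line
            rows.append(row)
            continue
        # Well-formed record: missing fields become None
        row = {f: record.get(f) for f in fields}
        row["_corrupt_record"] = None
        rows.append(row)
    return rows

sample = ['{"id": 1, "name": "a"}', '{"id": 2, "name": ', '{"id": 3}']
permissive = read_json_lines(sample, ["id", "name"])
dropped = read_json_lines(sample, ["id", "name"], mode="DROPMALFORMED")
print(len(permissive), len(dropped))
```

In the permissive result, filtering on whether `_corrupt_record` is null separates good rows from bad ones, which is the idea behind the update quoted above.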

Update nested struct with null fields

2023-02-17 Thread Vikas Kumar
Hi, I have a nested StructType. The StructType is deeply nested and may comprise other Structs. Now I want to update this struct at the lowest level. I tried withField, but it doesn't work if any of the top-level structs is null. I would appreciate any help with this. The example schema is: val

Re: [Spark Structured Streaming] Do spark structured streaming is support sink to AWS Kinesis currently?

2023-02-16 Thread Vikas Kumar
This doesn't directly answer your question, but there are ways to do it in Scala and PySpark. See if this helps: https://repost.aws/questions/QUP_OJomilTO6oIgvK00VHEA/writing-data-to-kinesis-stream-from-py-spark On Thu, Feb 16, 2023, 8:27 PM hueiyuan su wrote: > *Component*: Spark Structured Streaming >

Re: How to explode array columns of a dataframe having the same length

2023-02-16 Thread Vikas Kumar
I think these 4 steps should help: 1. Use zip. 2. Explode. 3. withColumn (get element of array). 4. Drop the array column. Thanks On Thu, Feb 16, 2023, 2:18 PM sam smith wrote: > @Enrico Minack I used arrays_zip to merge values > into one row, and then used toJSON() to export the data. > @Bjørn
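
As a rough illustration of the four steps above, here is a plain-Python model of what they do to a single row. It is not Spark code: in Spark the zip step would typically be `arrays_zip` and the explode step `explode`. The `zip_and_explode` helper, the row layout, and the column names `id`, `letters`, and `numbers` are made up for this sketch.

```python
# Plain-Python model of zip -> explode -> pick elements -> drop array column.
# The helper, the row, and its column names are hypothetical.

def zip_and_explode(row, array_cols):
    """Return one output row per array position; arrays must share a length."""
    lengths = {len(row[c]) for c in array_cols}
    if len(lengths) > 1:
        raise ValueError("array columns must have the same length")
    # Step 1: zip the arrays element-wise (Spark: arrays_zip).
    zipped = list(zip(*(row[c] for c in array_cols)))
    # Columns that are not arrays are copied onto every output row.
    scalars = {k: v for k, v in row.items() if k not in array_cols}
    out = []
    # Step 2: explode, one output row per zipped element (Spark: explode).
    for element in zipped:
        new_row = dict(scalars)
        # Step 3: pull each value back out into its own column.
        for name, value in zip(array_cols, element):
            new_row[name] = value
        # Step 4: the zipped array is never copied over, so it is dropped.
        out.append(new_row)
    return out

rows = zip_and_explode(
    {"id": 1, "letters": ["a", "b"], "numbers": [10, 20]},
    ["letters", "numbers"],
)
for r in rows:
    print(r)
```

Zipping before exploding keeps elements at the same position together; exploding each array separately would instead produce every combination of elements.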

Unsubscribe

2021-07-06 Thread Vikas Kumar
Unsubscribe

Unsubscribe

2020-12-10 Thread Vikas Kumar
Unsubscribe