Spark Job Fails while writing data to an S3 location in Parquet

2024-08-18 Thread Nipuna Shantha
Stack trace excerpt from the failing job:

    at Main.main(Main.scala:11)
    at Main.main(Main.scala)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:569)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1066)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1158)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1167)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Seeking support for the above issue. Thank you. Best regards, Nipuna Shantha

Dynamic value as the offset of lag() function

2023-05-23 Thread Nipuna Shantha
Hi all, This is the sample set of data that I used for this task: [image: image.png] I need to pass the value of the count column as the offset of the lag() function: lag(col(), lag(count)).over(windowspec). But since the lag function expects lag(Column, Int), the code above does not work. Can you suggest a method
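Spark's lag() accepts only a constant integer offset, so a per-row offset needs a workaround. One common approach (an assumption, not stated in the thread) is to collect the ordered column into a per-window array with collect_list() and index into it with element_at(). The row-wise logic that workaround implements, sketched in plain Python with hypothetical sample data:

```python
# Hypothetical sample: (value, count) rows in window order; count is the
# per-row offset we would like to pass to lag().
rows = [("a", 1), ("b", 2), ("c", 1), ("d", 3)]

values = [v for v, _ in rows]
lagged = []
for i, (_, offset) in enumerate(rows):
    j = i - offset                                # look back 'offset' rows
    lagged.append(values[j] if j >= 0 else None)  # None mimics lag()'s null

print(lagged)  # [None, None, 'b', 'a']
```

In Spark the same indexing would be done per row against the collected array; element_at() with an out-of-range index returns null, matching the None above.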

Incremental Value depends on another column of a Spark DataFrame

2023-05-23 Thread Nipuna Shantha
Hi all, This is the sample set of data that I used for this task: [image: image.png] My expected output is as below: [image: image.png] My scenario is: if Type is M01 the count should be 0, and if Type is M02 it should be incremented from 1 until the sequence of M02 rows is finished. Imagine this
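One way to express a counter that resets at each M01 row (a suggested approach, not from the thread) is to assign a group id as the running count of M01 rows via a window sum, then take row_number() - 1 within each group. The target logic, sketched in plain Python with hypothetical data:

```python
# Hypothetical Type column in row order.
types = ["M01", "M02", "M02", "M02", "M01", "M02", "M02"]

counts = []
current = 0
for t in types:
    if t == "M01":
        current = 0       # every M01 row restarts the counter at 0
    else:
        current += 1      # consecutive M02 rows count 1, 2, 3, ...
    counts.append(current)

print(counts)  # [0, 1, 2, 3, 0, 1, 2]
```

In Spark, the equivalent window version needs a deterministic ordering column, since row order in a DataFrame is not guaranteed without one.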

RE: Encoded data retrieved when reading Parquet file

2022-10-19 Thread Nipuna Shantha
Solved this issue. Thank you. On 2022/10/19 05:26:51 Nipuna Shantha wrote: > Hi all, I am writing to a Parquet file using Impala version 3.4.0 and trying to read the same Parquet file from Spark 3.3.0 into a DataFrame: var df = spark.read.parquet(parquet_file_name). But when

Encoded data retrieved when reading Parquet file

2022-10-19 Thread Nipuna Shantha
; it shows [54 41 58]. For the string "LAPTOP" it shows [4C 41 50 54 4F 50]. But when I read that same Parquet file from Impala it has no issues. Can anyone suggest a method to overcome this problem? Thank you, Best regards, Nipuna Shantha
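The bracketed hex values are simply the raw ASCII bytes of the strings, which suggests Spark is reading the Impala-written column as Parquet BINARY without the UTF-8 string annotation, and therefore returning byte arrays instead of strings. Spark SQL has a compatibility flag for exactly this, spark.sql.parquet.binaryAsString (default false), which tells it to interpret un-annotated binary columns as strings; setting it before the read is worth trying (a likely fix, since the thread does not state how the issue was resolved). A quick check that the bytes decode cleanly:

```python
# The hex values shown in the question are the plain ASCII bytes of the strings.
assert bytes([0x54, 0x41, 0x58]).decode("ascii") == "TAX"
assert bytes([0x4C, 0x41, 0x50, 0x54, 0x4F, 0x50]).decode("ascii") == "LAPTOP"
print("bytes decode to the original strings")
```

In Spark this would look like spark.conf.set("spark.sql.parquet.binaryAsString", "true") before calling spark.read.parquet(...).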