Facing a weird problem while reading Parquet

2021-08-10 Thread Prateek Rajput
…Parquet, and we are doing a simple write and read. For writing: ds.write().parquet(outputPath); // this writes 40K part files. For reading: sqlContext.read().parquet(inputPath).javaRDD() // here we are trying to read the same 40K part files back. Regards, Prateek Rajput…
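A minimal sketch of the round trip described in that snippet, assuming a SparkSession and a stand-in dataset (the thread does not show where ds comes from, and the paths are illustrative). Coalescing before the write is one common way to avoid producing tens of thousands of part files:

    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class ParquetRoundTrip {
      public static void main(String[] args) {
        SparkSession spark = SparkSession.builder().appName("parquet-roundtrip").getOrCreate();
        String outputPath = args[0]; // stand-in for the thread's outputPath

        Dataset<Row> ds = spark.range(1_000_000).toDF("id"); // stand-in dataset

        // Coalescing before the write caps the part-file count; 40K small
        // files make the later read pay for 40K file listings and footer reads.
        ds.coalesce(100).write().parquet(outputPath);

        // Read the same path back as a JavaRDD<Row>, as in the thread
        // (spark.read() plays the role of the thread's sqlContext.read()).
        JavaRDD<Row> rows = spark.read().parquet(outputPath).javaRDD();
        System.out.println("rows: " + rows.count());
        spark.stop();
      }
    }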

Re: Getting EOFException while reading from a sequence file in Spark

2019-05-03 Thread Prateek Rajput
Hi all, please share if anyone has faced the same problem. There are many similar issues on the web, but I did not find any solution or a reason why this happens. It would be really helpful. Regards, Prateek. On Mon, Apr 29, 2019 at 3:18 PM Prateek Rajput wrote: > I checked and removed 0 sized files…
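One commonly suggested mitigation (not quoted from this thread) is to let Spark skip unreadable inputs instead of aborting the job. A sketch assuming a reasonably recent Spark (2.1+) and Text keys/values:

    import org.apache.hadoop.io.Text;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class SkipCorruptFiles {
      public static void main(String[] args) {
        SparkConf conf = new SparkConf()
            .setAppName("skip-corrupt")
            // RDD-based reads (sequenceFile, hadoopFile, ...) will skip files
            // that throw an IOException/EOFException mid-read instead of
            // failing the whole job. DataFrame reads have a separate knob:
            // spark.sql.files.ignoreCorruptFiles.
            .set("spark.files.ignoreCorruptFiles", "true");
        JavaSparkContext sc = new JavaSparkContext(conf);
        long n = sc.sequenceFile(args[0], Text.class, Text.class).count();
        System.out.println("records read: " + n);
        sc.stop();
      }
    }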

Re: How to specify number of Partitions using newAPIHadoopFile()

2019-04-30 Thread Prateek Rajput
On Tue, Apr 30, 2019 at 6:48 PM Vatsal Patel wrote: > Issue: > > When I am reading a sequence file in Spark, I can specify the number of > partitions as an argument to the API; below is the signature: > public JavaPairRDD<K, V> sequenceFile(String path, Class<K> keyClass, Class<V> valueClass, int minPartitions)…
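For contrast, a sketch of both routes, since newAPIHadoopFile() takes no minPartitions argument: the usual workaround is to drive the split size through the Hadoop Configuration instead (the 64 MB cap below is illustrative; fewer bytes per split means more splits, hence more partitions):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class NewApiPartitions {
      public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext();

        // Old API: minPartitions is a direct argument.
        JavaPairRDD<Text, Text> oldApi =
            sc.sequenceFile(args[0], Text.class, Text.class, 400);

        // New API: no such argument; cap the split size so the standard
        // FileInputFormat produces more (hence smaller) splits.
        Configuration conf = new Configuration();
        conf.set("mapreduce.input.fileinputformat.split.maxsize",
            String.valueOf(64L * 1024 * 1024)); // 64 MB per split
        JavaPairRDD<Text, Text> newApi = sc.newAPIHadoopFile(
            args[0], SequenceFileInputFormat.class,
            Text.class, Text.class, conf);

        System.out.println(oldApi.getNumPartitions()
            + " / " + newApi.getNumPartitions());
        sc.stop();
      }
    }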

Re: Getting EOFException while reading from a sequence file in Spark

2019-04-29 Thread Prateek Rajput
…no such issue is coming; it is happening in the case of Spark only. On Mon, Apr 29, 2019 at 2:50 PM Deepak Sharma wrote: > This can happen if the file size is 0 > > On Mon, Apr 29, 2019 at 2:28 PM Prateek Rajput > wrote: > >> Hi guys, >> I am getting this strange error again and again…
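Following the file-size-0 suggestion, a sketch using the Hadoop FileSystem API (nothing here is quoted from the thread) that removes zero-length part files before the Spark read; an empty sequence file has no header for the reader to parse, which can surface as an EOFException:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class DropEmptyParts {
      public static void main(String[] args) throws Exception {
        Path dir = new Path(args[0]);
        FileSystem fs = dir.getFileSystem(new Configuration());
        for (FileStatus st : fs.listStatus(dir)) {
          // Only touch part files; markers like _SUCCESS are zero-length
          // by design and harmless.
          if (st.isFile() && st.getLen() == 0
              && st.getPath().getName().startsWith("part-")) {
            System.out.println("deleting empty part: " + st.getPath());
            fs.delete(st.getPath(), false);
          }
        }
      }
    }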

Getting EOFException while reading from a sequence file in Spark

2019-04-29 Thread Prateek Rajput
Hi guys, I am getting this strange error again and again while reading from a sequence file in Spark. User class threw exception: org.apache.spark.SparkException: Job aborted. at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:100) at…
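Note that the visible frame is SparkHadoopWriter, i.e. the EOF surfaces while a write action is forcing the upstream sequence-file scan. A minimal sketch of that read-then-write shape (paths and Text key/value types are assumptions; the thread does not show the job):

    import org.apache.hadoop.io.Text;
    import org.apache.spark.api.java.JavaSparkContext;

    public class SeqFileReadWrite {
      public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext();
        // The read is lazy; a truncated or empty part file only fails once
        // an action such as saveAsTextFile forces the scan, which is why
        // the stack trace shows the writer even though the read is at fault.
        sc.sequenceFile(args[0], Text.class, Text.class)
          .mapValues(Text::toString)
          .saveAsTextFile(args[1]);
        sc.stop();
      }
    }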