subject:"Reading 7z file in spark"

Re: Reading 7z file in spark

2020-01-14 Thread Andrew Melo

It only makes sense if the underlying file is also splittable, and even then, it doesn't really do anything for you if you don't explicitly tell Spark about the split boundaries.

On Tue, Jan 14, 2020 at 7:36 PM Someshwar Kale wrote:
> I would suggest using another compression technique that is

Re: Reading 7z file in spark

2020-01-14 Thread Someshwar Kale

I would suggest using another compression technique that is splittable, e.g. bzip2, LZO, or LZ4.

On Wed, Jan 15, 2020, 1:32 AM Enrico Minack wrote:
> Hi,
>
> Spark does not support 7z natively, but you can read any file in Spark:
>
> def read(stream: PortableDataStream): Iterator[String] = {

Re: Reading 7z file in spark

2020-01-14 Thread Enrico Minack

Hi,

Spark does not support 7z natively, but you can read any file in Spark:

    def read(stream: PortableDataStream): Iterator[String] = {
      Seq(stream.getPath()).iterator
    }

    spark.sparkContext
      .binaryFiles("*.7z")
      .flatMap(file => read(file._2))
      .toDF("path")
      .show(false)

This scales with
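The `read` function quoted above only returns each file's path. As a minimal sketch of how it might be extended to actually decompress the archives, the example below assumes Apache Commons Compress is on the classpath (the thread does not name a library) and that every entry is UTF-8 text small enough to fit in memory. The names `Read7z` and `linesOf7z` are hypothetical, and the `SevenZFile(SeekableByteChannel)` constructor used here is deprecated in newer Commons Compress releases in favour of a builder.

```scala
import org.apache.commons.compress.archivers.sevenz.SevenZFile
import org.apache.commons.compress.utils.SeekableInMemoryByteChannel
import org.apache.spark.sql.SparkSession

import scala.io.Source

// Hypothetical helper object; not part of Spark or of the original thread.
object Read7z {

  // Decompress every file entry of one in-memory 7z archive and return
  // its text lines. Assumes UTF-8 content and entries smaller than 2 GB.
  def linesOf7z(bytes: Array[Byte]): Seq[String] = {
    val archive = new SevenZFile(new SeekableInMemoryByteChannel(bytes))
    try {
      Iterator
        .continually(archive.getNextEntry)
        .takeWhile(_ != null)
        .filterNot(_.isDirectory)
        .flatMap { entry =>
          val content = new Array[Byte](entry.getSize.toInt)
          var off = 0
          var n = 0
          // read() may return fewer bytes than requested, so loop until full.
          while (off < content.length && n >= 0) {
            n = archive.read(content, off, content.length - off)
            if (n > 0) off += n
          }
          Source.fromBytes(content, "UTF-8").getLines()
        }
        .toList // materialize before the archive is closed
    } finally {
      archive.close()
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("read-7z").getOrCreate()
    import spark.implicits._

    spark.sparkContext
      .binaryFiles("*.7z") // one (path, stream) pair per archive
      .flatMap { case (_, stream) => linesOf7z(stream.toArray()) }
      .toDF("line")
      .show(false)

    spark.stop()
  }
}
```

With this approach each archive is loaded and decompressed as a whole inside a single task, so a single very large `.7z` file is not spread across executors.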

Reading 7z file in spark

2020-01-13 Thread HARSH TAKKAR

Hi,

Is it possible to read a 7z-compressed file in Spark?

Kind regards,
Harsh Takkar


