Re: How can I use backticks in column names?

2022-12-05 Thread Bjørn Jørgensen
df = spark.createDataFrame( [("china", "asia"), ("colombia", "south america`")], ["country", "continent`"] ) df.show() ++--+ | country|continent`| ++--+ | china| asia| |colombia|south america`| ++--+

Re: [PySpark] Reader/Writer for bgzipped data

2022-12-05 Thread Chris Nauroth
Sorry, I misread that in the original email. This is my first time looking at bgzip. I see from the documentation that it is putting some additional framing around gzip and producing a series of small blocks, such that you can create an index of the file and decompress individual blocks instead

Re: [PySpark] Reader/Writer for bgzipped data

2022-12-05 Thread Oliver Ruebenacker
Hello, Thanks for the response, but I mean compressed with bgzip , not bzip2. Best, Oliver On Fri, Dec 2, 2022 at 4:44 PM Chris Nauroth wrote: > Hello Oliver, > > Yes, Spark makes this possible using the Hadoop compression codecs and the >