Hi all,
Found the answer at the following link:
https://forums.databricks.com/questions/918/how-to-set-size-of-parquet-output-files.html
I can successfully set up the Parquet block size with
spark.hadoop.parquet.block.size.
The following is the sample code:
# init
block_size = 512 * 1024
conf = SparkConf().set("spark.hadoop.parquet.block.size", str(block_size))
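Filling in the rest of the flow, here is a minimal sketch of how that conf can be used; the DataFrame and output path below are made up, just for illustration:

from pyspark import SparkConf
from pyspark.sql import SparkSession

# 512 KB Parquet block (row group) size, passed through to the Hadoop conf
block_size = 512 * 1024
conf = SparkConf().set("spark.hadoop.parquet.block.size", str(block_size))

spark = SparkSession.builder.config(conf=conf).getOrCreate()

# made-up data and path, just to show the write call
df = spark.range(0, 1000000)
df.write.mode("overwrite").parquet("/tmp/parquet-512k-blocks")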
OK, will do.
On Fri, Mar 16, 2018 at 4:41 PM Sean Owen wrote:
> I think you can file a JIRA and open a PR. All of the bits that use "gpg
> ... SHA512 file ..." can use shasum instead.
> I would not change any existing release artifacts though.
>
> On Fri, Mar 16, 2018 at 1:14 PM Nicholas Chammas
I think you can file a JIRA and open a PR. All of the bits that use "gpg
... SHA512 file ..." can use shasum instead.
I would not change any existing release artifacts though.
On Fri, Mar 16, 2018 at 1:14 PM Nicholas Chammas
wrote:
> I have sha512sum on my Mac via Homebrew, but yeah as long as the format is
> the same I suppose it doesn’t matter if we use shasum -a or sha512sum.
OK, and the recording is now being processed and will be posted at the same
URL once it's done ( https://www.youtube.com/watch?v=pXzVtEUjrLc ). You can
also see a walk through with Cody merging his first PR (
https://www.youtube.com/watch?v=_SdNu7MezL4 ).
Since I had a slight problem during the liv
I have sha512sum on my Mac via Homebrew, but yeah as long as the format is
the same I suppose it doesn’t matter if we use shasum -a or sha512sum.
So shall I file a JIRA + PR for this? Or should I leave the PR to a
maintainer? And are we OK with updating all the existing release hashes to
use the new format?
Hi all,
Looks like it's a Parquet-specific issue.
I can successfully write with a 512k block size
if I use df.write.csv() or df.write.text().
(I can successfully do the CSV write when I put hadoop-lzo-0.4.15-cdh5.13.0.jar
into the jars dir.)
sample code:
block_size = 512 * 1024
conf = SparkConf().s...
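The conf line is cut off above, so the exact key isn't visible here. As a sketch of the setup being described for the CSV/text writes, assuming it was the HDFS block size being set via spark.hadoop.dfs.blocksize (that key is a guess, not taken from the original mail), it might look like this, with made-up data and paths:

from pyspark import SparkConf
from pyspark.sql import SparkSession

block_size = 512 * 1024  # 512 KB

# assumption: the truncated line was setting the HDFS block size for the output files
conf = SparkConf().set("spark.hadoop.dfs.blocksize", str(block_size))

spark = SparkSession.builder.config(conf=conf).getOrCreate()

# made-up data, cast to a single string column so both csv() and text() accept it
df = spark.range(0, 1000000).selectExpr("CAST(id AS STRING) AS value")
df.write.mode("overwrite").csv("/tmp/csv-512k-blocks")
df.write.mode("overwrite").text("/tmp/text-512k-blocks")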
I'm going to be doing another live stream code review today in ~5 minutes.
You can join and watch at https://www.youtube.com/watch?v=pXzVtEUjrLc & the
result will be posted as well. In this review I'll look at PRs in both the
Spark project and a related project, spark-testing-base.
--
Twitter: https
+1 there
From: Sean Owen
Sent: Friday, March 16, 2018 9:51:49 AM
To: Felix Cheung
Cc: rb...@netflix.com; Nicholas Chammas; Spark dev list
Subject: Re: Changing how we compute release hashes
I think the issue with that is that OS X doesn't have "sha512sum". Both it
and Linux have "shasum -a 512" though.
I think the issue with that is that OS X doesn't have "sha512sum". Both it
and Linux have "shasum -a 512" though.
On Fri, Mar 16, 2018 at 11:05 AM Felix Cheung
wrote:
> Instead of using gpg to create the sha512 hash file we could just change
> to using sha512sum? That would output the right format, which is in turn
> verifiable.
Instead of using gpg to create the sha512 hash file we could just change to
using sha512sum? That would output the right format, which is in turn verifiable.
From: Ryan Blue
Sent: Friday, March 16, 2018 8:31:45 AM
To: Nicholas Chammas
Cc: Spark dev list
Subject: Re: Changing how we compute release hashes
+1 It's possible to produce the same file with gpg, but the sha*sum
utilities are a bit easier to remember the syntax for.
On Thu, Mar 15, 2018 at 9:01 PM, Nicholas Chammas <
nicholas.cham...@gmail.com> wrote:
> To verify that I’ve downloaded a Hadoop release correctly, I can just do
> this:
>
>
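The actual commands in that quote didn't survive in this archive. For anyone following the format discussion: shasum -a 512 and sha512sum both emit lines of the form "<hex digest>  <filename>", which is what makes the published hash files directly checkable, while gpg's SHA512 output is formatted differently. A rough Python sketch of producing and checking that two-column format (the file names are hypothetical):

import hashlib

def sha512_line(path):
    # produce a "<hex digest>  <filename>" line, like shasum -a 512 / sha512sum
    h = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return "{}  {}".format(h.hexdigest(), path)

def verify(path, sha512_file):
    # compare against the recorded line (hypothetical .sha512 file name)
    with open(sha512_file) as f:
        recorded = f.read().strip()
    return sha512_line(path) == recorded

# e.g. verify("spark-2.3.0-bin-hadoop2.7.tgz", "spark-2.3.0-bin-hadoop2.7.tgz.sha512")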