> Begin forwarded message:
>
> From: "Chen, Kevin"
> Subject: Re: Missing output partition file in S3
> Date: September 19, 2016 at 10:54:44 AM PDT
> To: Steve Loughran
> Cc: "user@spark.apache.org"
>
> Hi Steve,
>
> Our S3 is in US East. But this issue also occurred when we were using an S3 buck
Alluxio tiers storage across memory, SSDs, and/or HDDs, with a DFS as the
persistent store, called the under-filesystem.
Hope this helps.
Richard Catlin
> On Sep 19, 2016, at 7:56 AM, aka.fe2s wrote:
>
> Hi folks,
>
> What has happened with Tachyon / Alluxio in Spark 2? The docs no longer mention it.
I believe it depends on your Spark application.
To write to Hive, use
dataframe.write.saveAsTable("table_name")
To write to S3, use
dataframe.write.parquet("s3://...")
Hope this helps.
Richard
> On Jun 16, 2016, at 9:54 AM, Natu Lauchande wrote:
>
> Does
Thank you.
Richard Catlin
I have two Dataframes. A "users" DF, and an "investments" DF. The
"investments" DF has a column that matches the "users" id. I would like to
nest the collection of investments for each user and save to a parquet file.
Is there a straightforward way to do this?
Thanks.
Richard Catlin
Michael,
How do I create a DataFrame(SchemaRDD) with a nested array of Rows in a
column? Is there an example? Will this store as a nested parquet file?
Thanks.
Richard Catlin
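For illustration, the shape the question describes, one record per user with that
user's investments nested as an array of structs, can be sketched outside Spark in
plain Python (all field and variable names below are hypothetical); in Spark SQL the
equivalent would be a join on the user id followed by a groupBy with collect_list
over a struct of the investment columns:

```python
from collections import defaultdict

# Hypothetical sample rows standing in for the "users" and "investments" DataFrames.
users = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]
investments = [
    {"user_id": 1, "ticker": "AAA", "amount": 100},
    {"user_id": 1, "ticker": "BBB", "amount": 250},
    {"user_id": 2, "ticker": "CCC", "amount": 75},
]

# Group investments by the user id they reference, dropping the join key
# from the nested records since it is redundant there.
by_user = defaultdict(list)
for inv in investments:
    by_user[inv["user_id"]].append(
        {k: v for k, v in inv.items() if k != "user_id"}
    )

# One record per user, with that user's investments nested as a list of structs.
nested = [dict(u, investments=by_user[u["id"]]) for u in users]
```

In PySpark the same result would come from something along the lines of
users_df.join(inv_df, users_df.id == inv_df.user_id).groupBy(users columns).agg(collect_list(struct(investment columns))),
and DataFrameWriter.parquet stores such an array-of-struct column as a nested
group in the Parquet file.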
I would like to write pdf files using pdfbox to HDFS from my Spark
application. Can this be done?
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Can-a-Spark-App-run-with-spark-submit-write-pdf-files-to-HDFS-tp23233.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.