Re: off heap to alluxio/tachyon in Spark 2

2016-09-19 Thread Richard Catlin
it with Memory, SSD, and/or HDDs with the DFS as the persistent store, called under-filesystem. Hope this helps. Richard Catlin > On Sep 19, 2016, at 7:56 AM, aka.fe2s <aka.f...@gmail.com> wrote: > > Hi folks, > > What has happened with Tachyon / Alluxio in Spark 2? Doc doesn't me

Fwd: Missing output partition file in S3

2016-09-19 Thread Richard Catlin
> Begin forwarded message: > > From: "Chen, Kevin" > Subject: Re: Missing output partition file in S3 > Date: September 19, 2016 at 10:54:44 AM PDT > To: Steve Loughran > Cc: "user@spark.apache.org" > > Hi Steve, > >

Re: difference between dataframe and dataframwrite

2016-06-16 Thread Richard Catlin
I believe it depends on your Spark application. To write to Hive, use dataframe.saveAsTable. To write to S3, use dataframe.write.parquet("s3://"). Hope this helps. Richard > On Jun 16, 2016, at 9:54 AM, Natu Lauchande wrote: > > Does

RE: Nested DataFrames

2015-06-25 Thread Richard Catlin
. Richard Catlin

Re: Nested DataFrame(SchemaRDD)

2015-06-24 Thread Richard Catlin
Michael, I have two DataFrames: a users DF and an investments DF. The investments DF has a column that matches the user's id. I would like to nest the collection of investments for each user and save to a parquet file. Is there a straightforward way to do this? Thanks. Richard Catlin On Tue

Nesting DataFrames and saving to Parquet

2015-06-24 Thread Richard Catlin
I have two DataFrames: a users DF and an investments DF. The investments DF has a column that matches the user's id. I would like to nest the collection of investments for each user and save to a parquet file. Is there a straightforward way to do this? Thanks. Richard Catlin

RE: Nested DataFrame(SchemaRDD)

2015-06-23 Thread Richard Catlin
How do I create a DataFrame(SchemaRDD) with a nested array of Rows in a column? Is there an example? Will this store as a nested parquet file? Thanks. Richard Catlin

Can a Spark App run with spark-submit write pdf files to HDFS

2015-06-09 Thread Richard Catlin
I would like to write pdf files using pdfbox to HDFS from my Spark application. Can this be done?