Hi,
I'm reading in a CSV file, and I would like to write it back as a permanent
table, but with partitioning by year, etc.
Currently I do this:
from pyspark.sql import HiveContext
sqlContext = HiveContext(sc)
# Note: the original snippet was cut off here; 'data.csv' is a
# placeholder path, and any further .options(...) were lost.
df = (sqlContext.read
      .format('com.databricks.spark.csv')
      .options(header='true')
      .load('data.csv'))
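Until native partitioned writes land, one possible workaround is to issue HiveQL DDL through the HiveContext above and insert into explicit partitions. This is only a sketch; the table and column names (`events`, `staging_events`, `value`) are hypothetical:

```sql
-- Hypothetical table/column names; assumes a HiveContext is available.
CREATE TABLE events (value STRING)
PARTITIONED BY (year INT)
STORED AS PARQUET;

-- Populate one partition at a time from a registered temp table.
INSERT OVERWRITE TABLE events PARTITION (year = 2015)
SELECT value FROM staging_events;
```

Each statement would be run via sqlContext.sql("...").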
This is tracked by these JIRAs:
https://issues.apache.org/jira/browse/SPARK-5947
https://issues.apache.org/jira/browse/SPARK-5948
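For context, the partitioning work tracked above produces Hive-style partition directories on disk (one `column=value` directory per partition value, with part files inside). The pure-Python sketch below only illustrates that on-disk layout; the rows, column name, and file contents are made up for the example:

```python
import os
import tempfile

# Toy rows standing in for a DataFrame partitioned by 'year'.
rows = [
    {"year": 2014, "value": "a"},
    {"year": 2015, "value": "b"},
    {"year": 2015, "value": "c"},
]

base = tempfile.mkdtemp()
for row in rows:
    # Each partition value becomes a directory: .../year=2015/
    part_dir = os.path.join(base, "year={0}".format(row["year"]))
    os.makedirs(part_dir, exist_ok=True)
    # Spark would write Parquet part files here; we append a text stub.
    with open(os.path.join(part_dir, "part-00000"), "a") as f:
        f.write(row["value"] + "\n")

print(sorted(os.listdir(base)))  # ['year=2014', 'year=2015']
```

Readers that understand this layout (Hive, Spark SQL) can then prune partitions by filtering on the partition column.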
From: denny.g@gmail.com
Date: Wed, 1 Apr 2015 04:35:08 +
Subject: Creating Partitioned Parquet Tables via SparkSQL
To: user@spark.apache.org
Creating Parquet tables via .saveAsTable is great, but I was wondering
whether there is an equivalent way to create partitioned Parquet tables.
Thanks!