Re: aliasing aggregate columns?

2015-04-17 Thread elliott cordo
({"cool_cnt": sum.alias("cool_cnt"), "*": count.alias("cnt")}) On Wed, Apr 15, 2015 at 7:23 PM, elliott cordo elliottco...@gmail.com wrote: Hi Guys - Having trouble figuring out the semantics for using the alias function on the final sum and count aggregations? cool_summary = reviews.select(reviews.user_id

aliasing aggregate columns?

2015-04-15 Thread elliott cordo
Hi Guys - Having trouble figuring out the semantics for using the alias function on the final sum and count aggregations? cool_summary = reviews.select(reviews.user_id, cool_cnt(votes.cool).alias("cool_cnt")).groupBy("user_id").agg({"cool_cnt": "sum", "*": "count"}) cool_summary DataFrame[user_id: string,
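
A minimal sketch of one way to attach aliases to the final aggregates, assuming reviews, votes, and the cool_cnt function are defined as in the message above: instead of the {"column": "function"} dict form of agg(), pass pyspark.sql.functions column expressions, each carrying its own .alias().

    from pyspark.sql import functions as F

    # Sketch only: reviews, votes and cool_cnt() come from the message above
    # and are assumed to be defined elsewhere. Column expressions in agg()
    # can each carry their own alias, unlike the dict form.
    cool_summary = (
        reviews
        .select(reviews.user_id, cool_cnt(votes.cool).alias("cool_cnt"))
        .groupBy("user_id")
        .agg(
            F.sum("cool_cnt").alias("cool_cnt"),  # aliased sum of cool votes
            F.count("*").alias("cnt"),            # aliased row count per user
        )
    )

With the dict form, Spark generates column names such as SUM(cool_cnt) and COUNT(1), which is why the alias has to travel with the expression itself.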

trouble with jdbc df in python

2015-03-25 Thread elliott cordo
If I run the following: db = sqlContext.load("jdbc", url="jdbc:postgresql://localhost/xx", dbtables="mstr.d_customer") I get the error: py4j.protocol.Py4JJavaError: An error occurred while calling o28.load. : java.io.FileNotFoundException: File file:/Users/elliottcordo/jdbc does not exist Seems to

Re: trouble with jdbc df in python

2015-03-25 Thread elliott cordo
at 6:12 PM, Michael Armbrust mich...@databricks.com wrote: Try: db = sqlContext.load(source="jdbc", url="jdbc:postgresql://localhost/xx", dbtables="mstr.d_customer") On Wed, Mar 25, 2015 at 2:19 PM, elliott cordo elliottco...@gmail.com wrote: If I run the following: db = sqlContext.load("jdbc"
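
A sketch of the corrected call from the reply above, for the Spark 1.3-era SQLContext API: passing source= as a keyword stops "jdbc" from being treated as a file path, which is what produced the FileNotFoundException for file:/Users/elliottcordo/jdbc. The URL and table name are the placeholders from the messages; note that the documented option name for the table in the JDBC source is dbtable (the message spells it dbtables), and the PostgreSQL driver jar still has to be on the classpath.

    # Sketch of the suggested fix (Spark 1.3-era API). source= must be a
    # keyword argument; a bare first positional argument to load() is taken
    # as a file path, hence the FileNotFoundException above.
    db = sqlContext.load(
        source="jdbc",
        url="jdbc:postgresql://localhost/xx",  # placeholder URL from the message
        dbtable="mstr.d_customer",             # documented option name is "dbtable"
    )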

JdbcRdd for Python

2015-01-02 Thread elliott cordo
Hi All - Is JdbcRdd currently supported? I'm having trouble finding any info or examples.

Re: JdbcRdd for Python

2015-01-02 Thread elliott cordo
it implemented. From: elliott cordo elliottco...@gmail.com Date: Friday, January 2, 2015 at 8:17 AM To: user@spark.apache.org Subject: JdbcRdd for Python Hi All - Is JdbcRdd currently supported? I'm having trouble finding any info or examples.
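
JdbcRDD is a Scala/Java API with no Python wrapper, so there is no direct PySpark equivalent in this era. A hedged sketch of the usual workaround once the Spark 1.3 JDBC data source is available, reusing the placeholder connection details from the threads above; .rdd then converts the resulting DataFrame back to an RDD of Row objects for code that expects an RDD.

    # Sketch, assuming Spark 1.3+ and a JDBC driver on the classpath:
    # go through the JDBC data source instead of JdbcRDD, then drop down
    # to an RDD of Row objects if RDD semantics are needed.
    df = sqlContext.load(
        source="jdbc",
        url="jdbc:postgresql://localhost/xx",  # placeholder URL
        dbtable="mstr.d_customer",             # placeholder table
    )
    rows = df.rdd  # RDD[Row], one per table row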

hiveContext.jsonFile fails with Unexpected close marker

2014-12-24 Thread elliott cordo
I have generally been impressed with the way jsonFile eats just about any JSON data model, but I'm getting this error when I try to ingest this file: Unexpected close marker ']': expected '}' Here are the commands from the pyspark shell: from pyspark.sql import HiveContext hiveContext =
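
jsonFile reads its input line by line and expects one complete JSON object per line; a file that holds a single pretty-printed array (or an object spanning multiple lines) typically fails with Jackson errors like the "Unexpected close marker" above. A minimal sketch of one workaround, with a hypothetical file path: read the whole file at once, flatten the top-level array, and hand the records to jsonRDD instead.

    import json
    from pyspark.sql import HiveContext

    # Sketch, assuming the input is one pretty-printed JSON array rather than
    # newline-delimited JSON. wholeTextFiles yields (path, content) pairs; the
    # array is exploded into individual records and re-serialized so jsonRDD
    # sees one JSON object per element.
    hiveContext = HiveContext(sc)

    whole = sc.wholeTextFiles("/path/to/reviews.json")      # hypothetical path
    records = whole.flatMap(lambda kv: json.loads(kv[1]))   # one dict per array element
    json_strings = records.map(json.dumps)                  # one JSON object per record
    table = hiveContext.jsonRDD(json_strings)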