Re: how to orderBy previous groupBy.count.orderBy in pyspark

2016-05-03 Thread webe3vt
Here is what I ended up doing. Improvements are welcome.

from pyspark.sql import SQLContext, Row
from pyspark.sql.types import StructType, StructField, IntegerType, StringType
from pyspark.sql.functions import asc, desc, sum, count
sqlContext = SQLContext(sc)
error_schema = StructType([
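The archived preview cuts off before the actual solution, so the details of error_schema and the final query are not recoverable here. The semantics being asked about, though — group rows by a key, count each group, then order the groups by that count — can be sketched in plain Python (without Spark) on hypothetical sample data:

```python
from collections import Counter

# Hypothetical sample values standing in for the column being grouped.
errors = ["disk", "net", "disk", "cpu", "disk", "net"]

# groupBy(...).count(): tally the number of rows per key.
counts = Counter(errors)

# orderBy on the count column, descending: sort (key, count) pairs by count.
ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)

print(ordered)  # [('disk', 3), ('net', 2), ('cpu', 1)]
```

This mirrors the intended result shape: one row per distinct key with its count, sorted by the count rather than the key.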

how to orderBy previous groupBy.count.orderBy in pyspark

2016-05-02 Thread webe3vt
I have the following simple example that I can't get to work correctly.

In [1]:
from pyspark.sql import SQLContext, Row
from pyspark.sql.types import StructType, StructField, IntegerType, StringType
from pyspark.sql.functions import asc, desc, sum, count
sqlContext = SQLContext(sc)
error_schema

Java exception when showing join

2016-04-22 Thread webe3vt
I am using pyspark with Netezza. I am getting a Java exception when trying to show the first row of a join. I can show the first row of each of the two dataframes separately, but not the result of a join. I get the same error for any action I take (first, collect, show). Am I doing something
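The message is truncated before any stack trace, so the cause of the exception is not recoverable here. For reference, the operation being attempted — an inner join of two row sets on a shared key, followed by inspecting the first rows of the result — can be sketched in plain Python on hypothetical data:

```python
# Hypothetical rows standing in for the two dataframes being joined:
# (key, value) pairs keyed on the join column.
left = [(1, "a"), (2, "b"), (3, "c")]
right = [(2, "x"), (3, "y"), (4, "z")]

# Build a lookup on the join key from the right side, then emit one output
# row per left-side row whose key appears on the right (an inner join).
lookup = {k: v for k, v in right}
joined = [(k, lv, lookup[k]) for k, lv in left if k in lookup]

print(joined)  # [(2, 'b', 'x'), (3, 'c', 'y')]
```

In Spark this whole pipeline is lazy, which is why the error surfaces only at the action (first, collect, show) rather than when the join is defined.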