SQL warehouse dir

2017-02-10 Thread Joseph Naegele
Hi all, I've read the docs for Spark SQL 2.1.0 but I'm still having issues with the warehouse and related details. I'm not using Hive proper, so my hive-site.xml consists only of: javax.jdo.option.ConnectionURL = jdbc:derby:;databaseName=/mnt/data/spark/metastore_db;create=true. I've set
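
For readability, the single property quoted above would look like this when laid out as a hive-site.xml file (a reconstruction of the quoted fragment only, not the poster's complete configuration):

    <configuration>
      <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:derby:;databaseName=/mnt/data/spark/metastore_db;create=true</value>
      </property>
    </configuration>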

Spark SQL 1.6.3 ORDER BY and partitions

2017-01-06 Thread Joseph Naegele
I have two separate but similar issues that I've narrowed down to a pretty good level of detail. I'm using Spark 1.6.3, particularly Spark SQL. I'm concerned with a single dataset for now, although the details apply to other, larger datasets. I'll call it "table". It's around 160 M records,
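
A minimal Scala sketch of the kind of job under discussion (the input path, column name, and query are assumptions for illustration; the thread excerpt does not include the actual code):

    // Hypothetical reproduction of the ORDER BY scenario on Spark 1.6 APIs.
    // The parquet path and sort column "id" are made up for illustration.
    val df = sqlContext.read.parquet("/mnt/data/table")
    df.registerTempTable("table")

    // In Spark 1.6, a global ORDER BY forces a range-partitioning shuffle;
    // the number of post-shuffle partitions comes from spark.sql.shuffle.partitions.
    val sorted = sqlContext.sql("SELECT * FROM table ORDER BY id")
    sorted.write.parquet("/mnt/data/table_sorted")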

Storage history in web UI

2017-01-03 Thread Joseph Naegele
Hi all, Is there any way to observe Storage history in Spark, i.e. which RDDs were cached and where, etc. after an application completes? It appears the Storage tab in the History Server UI is useless. Thanks --- Joe Naegele Grier Forensics
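
One workaround while an application is still running (it does not recover storage history after the application completes) is to poll the driver yourself; a sketch, assuming a live SparkContext named sc:

    // getRDDStorageInfo is a @DeveloperApi on SparkContext that returns one
    // RDDInfo per RDD currently occupying storage (cached in memory or on disk).
    sc.getRDDStorageInfo.foreach { info =>
      println(s"RDD ${info.id} '${info.name}': " +
        s"${info.numCachedPartitions}/${info.numPartitions} partitions cached, " +
        s"memory=${info.memSize} bytes, disk=${info.diskSize} bytes")
    }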

RE: [Spark SQL] Task failed while writing rows

2016-12-19 Thread Joseph Naegele
Thanks --- Joe Naegele Grier Forensics From: Michael Stratton [mailto:michael.strat...@komodohealth.com] Sent: Monday, December 19, 2016 10:00 AM To: Joseph Naegele <jnaeg...@grierforensics.com> Cc: user <user@spark.apache.org> Subject: Re: [Spark SQL] Task failed while

[Spark SQL] Task failed while writing rows

2016-12-18 Thread Joseph Naegele
Hi all, I'm having trouble with a relatively simple Spark SQL job. I'm using Spark 1.6.3. I have a dataset of around 500M rows (average 128 bytes per record). Its current compressed size is around 13 GB, but my problem started when it was much smaller, maybe 5 GB. This dataset is generated
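
The excerpt does not show the job itself; the following is a hypothetical Scala shape of such a write job, with paths and format assumed, plus the usual reading of that error:

    // Hypothetical shape of the failing job (paths and format are assumptions).
    val df = sqlContext.read.parquet("/data/input")   // ~500M rows, ~128 bytes each
    df.write.mode("overwrite").parquet("/data/output")

    // "org.apache.spark.SparkException: Task failed while writing rows" is a
    // wrapper thrown on the executor side; the real failure is usually in the
    // "Caused by:" section further down the same stack trace.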

spark nightly builds with Hadoop 2.7

2016-09-09 Thread Joseph Naegele
Hello, I'm using the Spark nightly build "spark-2.1.0-SNAPSHOT-bin-hadoop2.7" from http://people.apache.org/~pwendell/spark-nightly/spark-master-bin/ due to bugs in Spark 2.0.0 (SPARK-16740, SPARK-16802); however, I noticed that the recent builds only come in "-hadoop2.4-without-hive" and