RE: SequenceFile and object reuse

2015-11-19 Thread andrew.rowson
As I understand it, it's down to how Hadoop FileInputFormats work, and questions of mutability. If you were to read a file from Hadoop via an InputFormat with a simple Java program, the InputFormat's RecordReader creates a single, mutable instance of the Writable key class and a single, mutable
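The reuse behaviour described in this snippet can be illustrated with a minimal sketch. It is not Hadoop or Spark code: `MutableText` and `read_records` are hypothetical stand-ins for a Writable and a RecordReader, and the only assumption carried over from the message is that the reader hands back one mutable instance for every record.

```python
class MutableText:
    """Hypothetical stand-in for a Hadoop Writable: a single mutable holder."""

    def __init__(self):
        self.value = None

    def set(self, value):
        self.value = value

    def copy(self):
        # Create an independent object carrying the current value.
        clone = MutableText()
        clone.set(self.value)
        return clone


def read_records(lines):
    """Hypothetical reader that yields the SAME holder object for every record."""
    holder = MutableText()
    for line in lines:
        holder.set(line)  # overwrite in place instead of allocating a new object
        yield holder


lines = ["a", "b", "c"]

# Keeping the yielded references: every element is the same object,
# so all of them show whatever was written last.
aliased = [rec for rec in read_records(lines)]
aliased_values = [rec.value for rec in aliased]

# Copying each record before keeping it preserves the individual values.
copied = [rec.copy() for rec in read_records(lines)]
copied_values = [rec.value for rec in copied]

print(aliased_values)
print(copied_values)
```

Under these assumptions, anything that holds on to records beyond a single iteration step (caching, collecting, sorting) needs its own copy of each record, which is the usual reason for copying keys and values read through such a reader.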

EOFException on History server reading in progress lz4

2015-09-03 Thread andrew.rowson
I'm trying to stop the history server from spamming my logs with EOFExceptions when it tries to read a history file from HDFS that is both LZ4-compressed and incomplete. The actual exception is: java.io.EOFException: Stream ended prematurely at

RE: Spark builds: allow user override of project version at buildtime

2015-08-26 Thread andrew.rowson
So, I actually tried this, and it built without problems, but publishing the artifacts to Artifactory produced some strangeness in the child POMs, where the property wasn’t resolved. Pulling them into other projects then fails with: “Could not find

Spark builds: allow user override of project version at buildtime

2015-08-25 Thread andrew.rowson
I've got an interesting challenge in building Spark. For various reasons we do a few different builds of Spark, typically with a few different profile options (e.g. against different versions of Hadoop, some with or without Hive, etc.). We mirror the Spark repo internally and have a build server that