@Reynold Xin: not really: it only works for Parquet (see partitionBy:
https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.DataFrameWriter),
and it requires you to have a DataFrame in the first place (for my use case
the Spark SQL interface to Avro records is more of a
Howdy folks!
I’m interested in hearing about what people think of spark-ec2
http://spark.apache.org/docs/latest/ec2-scripts.html outside of the
formal JIRA process. Your answers will all be anonymous and public.
If the embedded form below doesn’t work for you, you can use this link to
get the
I am comparing Spark's logs line by line between the hanging case (big
dataset) and the non-hanging case (small dataset).
In the hanging case, the log looks identical to the non-hanging case while
reading the first block of data from HDFS.
But after that, starting from line 438 in the
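For this kind of line-by-line comparison it can help to strip the timestamps first, so that only the messages are diffed. A minimal sketch, assuming the default log4j timestamp pattern ("yy/MM/dd HH:mm:ss"); `hanging.log` and `ok.log` stand in for the two driver logs, and the sample lines below are made up for illustration:

```shell
# Create two tiny illustrative logs (replace with your real driver logs).
printf '15/08/17 00:36:01 INFO BlockManager: read block 1\n15/08/17 00:36:05 INFO BlockManager: read block 2\n' > hanging.log
printf '15/08/17 00:40:11 INFO BlockManager: read block 1\n15/08/17 00:40:12 INFO BlockManager: read block 3\n' > ok.log

# Strip the leading "yy/MM/dd HH:mm:ss " timestamp so identical messages
# logged at different times compare equal.
strip_ts() { sed -E 's#^[0-9]{2}/[0-9]{2}/[0-9]{2} [0-9:]{8} ##' "$1"; }

# diff exits 1 when the files differ, which is expected here.
diff <(strip_ts hanging.log) <(strip_ts ok.log) || true
```

The first diverging line is then the place to start reading in the hanging run's log.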
This should be fixed now. I just triggered a manual build and the
latest binaries are at
http://people.apache.org/~pwendell/spark-nightly/spark-master-bin/spark-1.5.0-SNAPSHOT-2015_08_17_00_36-3ff81ad-bin/
Thanks
Shivaram
On Mon, Aug 17, 2015 at 12:26 AM, Olivier Girardot
Hi Nick,
I forgot to mention in the survey that ganglia never installs properly,
for some reason.
I get this error every time I launch the cluster:
Starting httpd: httpd: Syntax error on line 154 of
/etc/httpd/conf/httpd.conf: Cannot load
/etc/httpd/modules/mod_authz_core.so into
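This looks like it could be an Apache version mismatch: mod_authz_core only exists in Apache 2.4, so a 2.4-style LoadModule line in an Apache 2.2 httpd.conf fails to load. A workaround sketch, under that assumption, is to comment out the 2.4-only module line; it is demonstrated here on a local sample copy so it is safe to dry-run (point CONF at /etc/httpd/conf/httpd.conf on the master, with sudo, to apply it for real):

```shell
# Sample config standing in for /etc/httpd/conf/httpd.conf (hypothetical
# contents for illustration; inspect the real file first).
CONF=./httpd.conf.sample
printf '%s\n' \
  'LoadModule authz_host_module modules/mod_authz_host.so' \
  'LoadModule authz_core_module modules/mod_authz_core.so' > "$CONF"

# Comment out any LoadModule line referencing the 2.4-only module,
# assuming that is what the stock 2.2 httpd is choking on.
sed -i -e 's|^\(LoadModule .*mod_authz_core\.so.*\)|#\1|' "$CONF"

grep 'authz' "$CONF"
```

On the real cluster this would be followed by restarting httpd; whether other 2.4-only LoadModule lines also need commenting out depends on what the AMI shipped in its httpd.conf.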