Hi dev:
I am using spark-shell to run the example from the section
'http://spark.apache.org/docs/2.2.2/sql-programming-guide.html#type-safe-user-defined-aggregate-functions',
and I get the following error:
Caused by: java.io.NotSerializableException: org.apache.spark.sql.TypedColumn
Serialization s
Hi guys:
I am testing Spark 2.1.1 on CDH 5.7.1. When I run on YARN with the
following command, the error 'NoSuchMethodError:
org.apache.spark.network.client.TransportClient.getChannel()Lio/netty/channel/Channel;'
sometimes appears:
command:
su cloudera-scm -s "/bin/sh" -c "/opt/spark2/bin/spa
I use the following commands to run the unit tests:
./make-distribution.sh --tgz --skip-java-test -Pscala-2.10 -Phadoop-2.3
-Phive -Phive-thriftserver -Pyarn -Dyarn.version=2.3.0-cdh5.1.2
-Dhadoop.version=2.3.0-cdh5.1.2
mvn -Pscala-2.10 -Phadoop-2.3 -Phive -Phive-thriftserver -Pyarn
-Dyarn.version=2.3.0-cd
My test env:
1. Spark version is 1.3.0
2. 3 nodes, 80G/20C per node
3. read 250G parquet files from hdfs
Test case:
1. register "floor" func with command:
sqlContext.udf.register("floor", (ts: Int) => ts - ts % 300),
then run with sql "select chan, floor(ts) as tt, sum(size) from qlogbase3 group by
chan, fl
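
To make the test case concrete, here is a minimal sketch in plain Scala (no Spark needed) of what the registered lambda computes: it rounds ts down to a multiple of 300, which would be 5-minute buckets if ts is in seconds (that unit is my assumption). The names FloorSketch and floor300 and the sample values are made up for illustration.

```scala
object FloorSketch {
  // Same expression as the lambda passed to sqlContext.udf.register above;
  // the name floor300 is hypothetical, chosen only for this sketch.
  val floor300: Int => Int = ts => ts - ts % 300

  def main(args: Array[String]): Unit = {
    // Sample inputs are invented for illustration.
    Seq(0, 299, 300, 1234).foreach { ts =>
      println(s"$ts -> ${floor300(ts)}")
    }
  }
}
```

So every row whose ts falls in the same 300-wide window gets the same tt value, and the group by collapses them into one bucket. Note that for negative inputs Scala's % keeps the sign of the dividend, so the expression rounds toward zero rather than down.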