Powered By Spark

2018-12-22 Thread Ascot Moss
Hi, we use Apache Spark for many use cases; can anyone advise how to add our entry to "http://spark.apache.org/powered-by.html"? Thanks and Happy Holidays!

WARN HIVE: Failed to access metastore. This class should not accessed in runtime

2017-08-14 Thread Ascot Moss
Hi, I got the following error when starting the Spark Thrift Server: WARN HIVE: Failed to access metastore. This class should not accessed in runtime. org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate

ThriftServer on HTTPS

2017-08-12 Thread Ascot Moss
Hi, I have the Spark ThriftServer up and running on HTTP; where can I find the steps to set up the Spark ThriftServer on HTTPS? Regards

Re: ERROR transport.TSaslTransport: SASL negotiation failure javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to f

2017-08-12 Thread Ascot Moss
I fixed the issue (there was no permission on the keytab file); please ignore. On Sun, Aug 13, 2017 at 9:42 AM, Ascot Moss <ascot.m...@gmail.com> wrote: > Hi, > > Spark: 2.1.0 > Hive: 2.1.1 > > When starting thrift server, I got the following error: > > How can I fix it?

ThriftServer Start Error

2017-08-12 Thread Ascot Moss
I tried to start the spark-thrift server but got the following error: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] java.io.IOException: javax.security.sasl.SaslException: GSS initiate

ThriftServer Start Error

2017-08-11 Thread Ascot Moss
Hi, I tried to start the spark-thrift server but got the following error: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] java.io.IOException: javax.security.sasl.SaslException: GSS initiate

Spark ThriftServer Error

2017-08-11 Thread Ascot Moss
Hi, when starting the Thrift Server, I got the following issue: 17/08/11 16:06:56 ERROR util.Utils: Uncaught exception in thread Thread-3 java.lang.NullPointerException at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$$anonfun$main$1.apply$mcV$sp(HiveThriftServer2.scala:85) at

Spark Thriftserver ERROR

2017-08-11 Thread Ascot Moss
Hi, I tried to start the spark-thrift server but got the following error: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] java.io.IOException: javax.security.sasl.SaslException: GSS initiate

Re: Spark 2.0 Build Failed

2016-07-29 Thread Ascot Moss
tionException On Fri, Jul 29, 2016 at 1:46 PM, Ascot Moss <ascot.m...@gmail.com> wrote: > I just run > > wget https://repo1.maven.org/maven2/org/apache/apache/14/apache-14.pom, > can get it without issue. > > On Fri, Jul 29, 2016 at 1:44 PM, Ascot Moss <ascot.m...@gma

Re: Spark 2.0 Build Failed

2016-07-28 Thread Ascot Moss
I just ran wget https://repo1.maven.org/maven2/org/apache/apache/14/apache-14.pom and can get it without issue. On Fri, Jul 29, 2016 at 1:44 PM, Ascot Moss <ascot.m...@gmail.com> wrote: > Hi thanks! > > mvn dependency:tree > > [INFO] Scanning for projects... > > D

Re: Spark 2.0 Build Failed

2016-07-28 Thread Ascot Moss
eption On Fri, Jul 29, 2016 at 1:34 PM, Dong Meng <mengdong0...@gmail.com> wrote: > Before build, first do a "mvn dependency:tree" to make sure the > dependency is right > > On Thu, Jul 28, 2016 at 10:18 PM, Ascot Moss <ascot.m...@gmail.com> wrote: > >> Th

Re: Spark 2.0 Build Failed

2016-07-28 Thread Ascot Moss
n repo. It's not > related to Spark. > > On Thu, Jul 28, 2016 at 4:04 PM, Ascot Moss <ascot.m...@gmail.com> wrote: > > Hi, > > > > I tried to build spark, > > > > (try 1) > > mvn -Pyarn -Phadoop-2.7.0 -Dscala-2.11 -Dhadoop.version=2.7.0 -Phive > &g

Spark 2.0 Build Failed

2016-07-28 Thread Ascot Moss
Hi, I tried to build Spark. (try 1) mvn -Pyarn -Phadoop-2.7.0 -Dscala-2.11 -Dhadoop.version=2.7.0 -Phive -Phive-thriftserver -DskipTests clean package [INFO] Spark Project Parent POM ... FAILURE [ 0.658 s] [INFO] Spark Project Tags .

Re: saveAsTextFile at treeEnsembleModels.scala:447, took 2.513396 s Killed

2016-07-28 Thread Ascot Moss
it helps. > Florin > > > On Thu, Jul 28, 2016 at 3:49 AM, Ascot Moss <ascot.m...@gmail.com> wrote: > >> >> Hi, >> >> Please help! >> >> When saving the model, I got following error and cannot save the model to >> hdfs: >> >>

A question about Spark Cluster vs Local Mode

2016-07-27 Thread Ascot Moss
Hi, if I submit the same job to Spark in cluster mode, does that mean it will run in the cluster's memory pool and fail if it runs out of the cluster's memory? --driver-memory 64g \ --executor-memory 16g \ Regards

DecisionTree currently only supports maxDepth <= 30

2016-07-27 Thread Ascot Moss
Hi, is there any reason behind the maxDepth <= 30 limit? Can it go deeper? Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: DecisionTree currently only supports maxDepth <= 30, but was given maxDepth = 50. at

saveAsTextFile at treeEnsembleModels.scala:447, took 2.513396 s Killed

2016-07-27 Thread Ascot Moss
Hi, please help! When saving the model, I got the following error and cannot save the model to HDFS (my source code; my Spark is v1.6.2): my_model.save(sc, "/my_model") - 16/07/28 08:36:19 INFO TaskSchedulerImpl: Removed TaskSet 69.0, whose tasks have all

Re: DAGScheduler: Job 20 finished: collectAsMap at DecisionTree.scala:651, took 19.556700 s Killed

2016-07-26 Thread Ascot Moss
it.ly/mastering-apache-spark > Follow me at https://twitter.com/jaceklaskowski > > > On Tue, Jul 26, 2016 at 1:27 AM, Ascot Moss <ascot.m...@gmail.com> wrote: > > Hi, > > > > spark: 1.6.1 > > java: java 1.8_u40 > > I tried random forest training phase, the same

Re: DAGScheduler: Job 20 finished: collectAsMap at DecisionTree.scala:651, took 19.556700 s Killed

2016-07-26 Thread Ascot Moss
Any ideas? On Tuesday, July 26, 2016, Ascot Moss <ascot.m...@gmail.com> wrote: > Hi, > > spark: 1.6.1 > java: java 1.8_u40 > I tried random forest training phase, the same code works well if with 20 > trees (lower accuracy, about 68%). When trying the training phase

DAGScheduler: Job 20 finished: collectAsMap at DecisionTree.scala:651, took 19.556700 s Killed

2016-07-25 Thread Ascot Moss
Hi, spark: 1.6.1 java: java 1.8_u40 I tried the random forest training phase; the same code works well with 20 trees (lower accuracy, about 68%). When trying the training phase with more trees, set to 200, it returned: "DAGScheduler: Job 20 finished: collectAsMap at

Spark 1.6.2 version displayed as 1.6.1

2016-07-24 Thread Ascot Moss
Hi, I am trying to upgrade Spark from 1.6.1 to 1.6.2; in the 1.6.2 spark-shell the version is still displayed as 1.6.1. Is this a minor typo/bug? Regards ### Welcome to (spark-shell ASCII-art banner follows)

Re: Size exceeds Integer.MAX_VALUE

2016-07-24 Thread Ascot Moss
ory available to > spark and caching the data in memory. Make sure you are using Kryo > serialization. > > Andrew > > On Jul 23, 2016, at 9:00 PM, Ascot Moss <ascot.m...@gmail.com> wrote: > > > Hi, > > Please help! > > My spark: 1.6.2 > Java: java8_u40 > >

Size exceeds Integer.MAX_VALUE

2016-07-23 Thread Ascot Moss
Hi, please help! My spark: 1.6.2 Java: java8_u40 I am trying random forest training and got "Size exceeds Integer.MAX_VALUE". Any idea how to resolve it? (the log) 16/07/24 07:59:49 ERROR Executor: Exception in task 0.0 in stage 7.0 (TID 25) java.lang.IllegalArgumentException: Size exceeds
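
This error usually means a single cached or shuffle block has grown past the 2 GB ByteBuffer limit. A minimal sketch of one common mitigation, assuming the training set lives in an RDD named trainData and that the partition count shown is only illustrative:

    // Hedged sketch: spread the data over more, smaller partitions before
    // training so no single block approaches the ~2 GB ByteBuffer limit.
    // The RDD name (trainData) and the partition count (1000) are assumptions.
    import org.apache.spark.storage.StorageLevel

    val repartitioned = trainData.repartition(1000)
    repartitioned.persist(StorageLevel.MEMORY_AND_DISK)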

Re: ERROR Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.

2016-07-23 Thread Ascot Moss
16 at 6:37 AM, Ascot Moss <ascot.m...@gmail.com> wrote: > My JDK is Java 1.8 u40 > > On Sun, Jul 24, 2016 at 3:45 AM, Ted Yu <yuzhih...@gmail.com> wrote: > >> Since you specified +PrintGCDetails, you should be able to get some more >> detail from the GC lo

Re: ERROR Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.

2016-07-23 Thread Ascot Moss
re G1GC is more reliable. > > On Sat, Jul 23, 2016 at 10:38 AM, Ascot Moss <ascot.m...@gmail.com> wrote: > >> Hi, >> >> I added the following parameter: >> >> --conf "spark.executor.extraJavaOptions=-XX:+UseG1GC >> -XX:MaxGCPauseMillis=200 -XX:Paral

Re: ERROR Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.

2016-07-23 Thread Ascot Moss
.scala:214) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Regards On Sat, Jul 23, 2016 at 9:49 AM, Ascot Moss <ascot.m...@gmail.com> wrote: > Thanks

ERROR Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.

2016-07-22 Thread Ascot Moss
Hi, please help! When running the random forest training phase in cluster mode, I got "GC overhead limit exceeded". I used two parameters when submitting the job to the cluster: --driver-memory 64g \ --executor-memory 8g \ My current settings (spark-defaults.conf): spark.executor.memory
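
For reference, a hedged sketch of how the executor-side settings, together with the G1GC options mentioned later in this thread, can be expressed programmatically; the values are illustrative rather than recommendations:

    import org.apache.spark.SparkConf

    // Illustrative values only. Driver memory normally has to be given at
    // submit time (--driver-memory), since the driver JVM is already running
    // by the time this SparkConf is evaluated.
    val conf = new SparkConf()
      .set("spark.executor.memory", "8g")
      .set("spark.executor.extraJavaOptions",
           "-XX:+UseG1GC -XX:MaxGCPauseMillis=200")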

Random Forest generate model failed (DecisionTree.scala:642), which has no missing parents

2016-07-15 Thread Ascot Moss
Hi, I am trying to create the Random Forest model; my source code is as follows: val rf_model = RandomForest.trainClassifier(trainData, 7, Map[Int,Int](), 20, "auto", "entropy", 30, 300) I got the following error: ## 16/07/15 19:55:04 INFO TaskSchedulerImpl: Removed TaskSet 21.0, whose tasks
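
A hedged restatement of the call above with the MLlib 1.x parameter names spelled out; trainData is assumed to be an RDD[LabeledPoint] prepared elsewhere in the pipeline:

    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.mllib.tree.RandomForest
    import org.apache.spark.rdd.RDD

    // Same arguments as in the message above, with the parameter names written out.
    def trainModel(trainData: RDD[LabeledPoint]) =
      RandomForest.trainClassifier(
        trainData,
        numClasses = 7,
        categoricalFeaturesInfo = Map[Int, Int](),  // all features treated as continuous
        numTrees = 20,
        featureSubsetStrategy = "auto",
        impurity = "entropy",
        maxDepth = 30,
        maxBins = 300)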

Random Forest Job got killed (DAGScheduler: failed: Set() , DecisionTree.scala:642), which has no missing parents)

2016-07-15 Thread Ascot Moss
Hi, I am trying to create the Random Forest model; my source code is as follows: val rf_model = edhRF.trainClassifier(trainData, 7, Map[Int,Int](), 20, "auto", "entropy", 30, 300) I got the following error: ## 16/07/15 19:55:04 INFO TaskSchedulerImpl: Removed TaskSet 21.0, whose tasks have all

LogisticRegression.scala ERROR, require(Predef.scala)

2016-06-23 Thread Ascot Moss
Hi, my Spark is 1.5.2; when trying MLlib, I got the following error. Any idea how to fix it? Regards == 16/06/23 16:26:20 ERROR Executor: Exception in task 0.0 in stage 5.0 (TID 5) java.lang.IllegalArgumentException: requirement failed at

Re: Apache Flink

2016-04-16 Thread Ascot Moss
I compared both last month; it seems to me that Flink's MLlib is not yet ready. On Sun, Apr 17, 2016 at 12:23 AM, Mich Talebzadeh wrote: > Thanks Ted. I was wondering if someone is using both :) > > Dr Mich Talebzadeh > > > > LinkedIn * >

ERROR ArrayBuffer(java.nio.channels.ClosedChannelException

2016-03-19 Thread Ascot Moss
Hi, I have a Spark Streaming (with Kafka) job; after running for several days, it failed with the following errors: ERROR DirectKafkaInputDStream: ArrayBuffer(java.nio.channels.ClosedChannelException) Any idea what could be wrong? Could it be a Spark Streaming buffer overflow issue? Regards *** from

How to compile Python and use spark-submit

2016-01-08 Thread Ascot Moss
Hi, instead of using spark-shell, does anyone know how to build a .zip (or .egg) for Python and run it with spark-submit? Regards

Pivot Data in Spark and Scala

2015-10-29 Thread Ascot Moss
Hi, I have data as follows: A, 2015, 4 A, 2014, 12 A, 2013, 1 B, 2015, 24 B, 2013, 4 I need to convert the data to a new format: A,4,12,1 B,24,,4 Any idea how to do this in Spark with Scala? Thanks
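
One possible approach with plain RDD operations; a minimal sketch in which the input path, the trimming, and the descending year order are assumptions based on the sample above:

    // Hedged sketch: pivot "key, year, value" rows into one line per key,
    // with one column per year in descending order. The path is hypothetical.
    val raw = sc.textFile("/data/records.csv")        // lines like "A, 2015, 4"
    val parsed = raw.map { line =>
      val Array(key, year, value) = line.split(",").map(_.trim)
      (key, (year.toInt, value))
    }
    val years = parsed.map(_._2._1).distinct().collect().sorted.reverse  // 2015, 2014, 2013
    val pivoted = parsed.groupByKey().map { case (key, pairs) =>
      val byYear = pairs.toMap
      key + "," + years.map(y => byYear.getOrElse(y, "")).mkString(",")
    }
    pivoted.collect().foreach(println)                // A,4,12,1 and B,24,,4

On Spark 1.6 and later, the DataFrame API's groupBy(...).pivot(...) can avoid this manual bookkeeping.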

Spark: How to find similar text title

2015-10-20 Thread Ascot Moss
Hi, I have an RDD that stores the titles of some articles: 1. "About Spark Streaming" 2. "About Spark MLlib" 3. "About Spark SQL" 4. "About Spark Installation" 5. "Kafka Streaming" 6. "Kafka Setup" 7. I need to build a model to find titles by similarity, e.g. if given "About Spark", hope to
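
A minimal sketch of one common approach: MLlib TF-IDF vectors ranked by cosine similarity against the query. The feature size, tokenization, and variable names are assumptions, and the thread does not say which method was eventually used:

    import org.apache.spark.mllib.feature.{HashingTF, IDF}
    import org.apache.spark.mllib.linalg.Vector

    // Hedged sketch: hash each title into a TF-IDF vector, then rank titles
    // by cosine similarity to the query "About Spark".
    val titles = sc.parallelize(Seq(
      "About Spark Streaming", "About Spark MLlib", "About Spark SQL",
      "About Spark Installation", "Kafka Streaming", "Kafka Setup"))

    val hashingTF = new HashingTF(1 << 18)
    val tf = hashingTF.transform(titles.map(_.toLowerCase.split("\\s+").toSeq))
    tf.cache()
    val idfModel = new IDF().fit(tf)
    val tfidf = idfModel.transform(tf)

    def cosine(a: Vector, b: Vector): Double = {
      val (da, db) = (a.toArray, b.toArray)
      val dot = da.zip(db).map { case (x, y) => x * y }.sum
      dot / (math.sqrt(da.map(x => x * x).sum) * math.sqrt(db.map(x => x * x).sum))
    }

    val query = idfModel.transform(
      hashingTF.transform("About Spark".toLowerCase.split("\\s+").toSeq))
    titles.zip(tfidf)
      .map { case (t, v) => (t, cosine(query, v)) }
      .sortBy(_._2, ascending = false)
      .take(3)
      .foreach(println)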