Re: Help me learn about JOB TASK and DAG in Apache Spark

2023-04-01 Thread Mich Talebzadeh
Good stuff, Khalid. I have created a section in the Apache Spark Community Slack called spark-foundation (spark-foundation - Apache Spark Community - Slack). I invite you to add your weblink to that section.

Re: Help me learn about JOB TASK and DAG in Apache Spark

2023-04-01 Thread Khalid Mammadov
Hey AN-TRUONG, I have some articles about this subject that should help, e.g. https://khalidmammadov.github.io/spark/spark_internals_rdd.html. Also check the other Spark internals articles on the web. Regards, Khalid. On Fri, 31 Mar 2023, 16:29 AN-TRUONG Tran Phan wrote: > Thank you for your information, >

Re: Help me learn about JOB TASK and DAG in Apache Spark

2023-03-31 Thread Mich Talebzadeh
Yes, history refers to completed jobs; 4040 shows the running jobs. You should have screenshots for the executors and stages tabs as well. HTH Mich Talebzadeh, Lead Solutions Architect/Engineering Lead, Palantir Technologies Limited
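For completed jobs to show up on port 18080, event logging has to be switched on; a minimal sketch of the usual history server setup, assuming an HDFS log directory (the path is illustrative):

    # conf/spark-defaults.conf
    spark.eventLog.enabled           true
    spark.eventLog.dir               hdfs:///spark-logs
    spark.history.fs.logDirectory    hdfs:///spark-logs

    # then start the history server (listens on 18080 by default)
    ./sbin/start-history-server.sh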

Re: Help me learn about JOB TASK and DAG in Apache Spark

2023-03-31 Thread AN-TRUONG Tran Phan
Thank you for your information. I have tracked the Spark history server on port 18080 and the Spark UI on port 4040, and the results of these two tools look similar, right? I want to know what each Task ID (for example Task ID 0, 1, 3, 4, 5, ...) in the images does. Is that possible?
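The UI only shows per-task metrics, but you can also log what each task did programmatically with a SparkListener; a minimal Scala sketch (the class name is mine, not from the thread):

    import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

    // Prints one line per finished task: owning stage, task ID, duration.
    class TaskLogger extends SparkListener {
      override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
        val info = taskEnd.taskInfo
        println(s"stage=${taskEnd.stageId} task=${info.taskId} durationMs=${info.duration}")
      }
    }

    // register it on an existing SparkContext:
    // sc.addSparkListener(new TaskLogger)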

Re: Help me learn about JOB TASK and DAG in Apache Spark

2023-03-31 Thread Mich Talebzadeh
Are you familiar with the Spark GUI, on port 4040 by default? Have a look. HTH Mich Talebzadeh, Lead Solutions Architect/Engineering Lead, Palantir Technologies Limited

Unusual bug, please help me, I can do nothing!!!

2022-03-30 Thread spark User
Failed to initialize Spark session. org.apache.spark.SparkException: Invalid Spark URL: spark://HeartbeatReceiver@x.168.137.41:49963. When I add "x.168.137.41" to /etc/hosts it works fine; but after a "ctrl+c" and another try, it cannot start normally. Please help me
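This error usually means the driver's hostname cannot be used to form a valid spark:// URL (hostnames containing underscores are a classic trigger). Pinning the driver host is a common workaround; a hedged sketch (the values are illustrative):

    # either as a config property on submit...
    spark-submit --conf spark.driver.host=localhost ...
    # ...or as an environment variable before launching
    export SPARK_LOCAL_HOSTNAME=localhost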

Error bug, please help me!!!

2022-03-20 Thread spark User
Failed to initialize Spark session. org.apache.spark.SparkException: Invalid Spark URL: spark://HeartbeatReceiver@x.168.137.41:49963. When I add "x.168.137.41" to /etc/hosts it works fine; but after a "ctrl+c" and another try, it cannot start normally. Please help me

please help me: when I write code to connect Kafka with Spark using Python and run it on Jupyter, an error is displayed

2018-09-16 Thread hager
I wrote code to connect Kafka with Spark using Python, and I run the code on Jupyter. My code: import os #os.environ['PYSPARK_SUBMIT_ARGS'] = '--jars /home/hadoop/Desktop/spark-program/kafka/spark-streaming-kafka-0-8-assembly_2.10-2.0.0-preview.jar pyspark-shell' os.environ['PYSPARK_SUBMIT_ARGS'] =
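The usual pattern for this setup is to set PYSPARK_SUBMIT_ARGS before any pyspark import, then build the stream. A minimal Python sketch under that assumption (paths, hosts, and topic names are illustrative):

    import os
    # must be set before importing pyspark, or the jar is not picked up
    os.environ['PYSPARK_SUBMIT_ARGS'] = (
        '--jars /path/to/spark-streaming-kafka-0-8-assembly.jar pyspark-shell')

    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext
    from pyspark.streaming.kafka import KafkaUtils

    sc = SparkContext(appName="KafkaExample")
    ssc = StreamingContext(sc, 5)  # 5-second batches

    # ZooKeeper quorum, consumer group, and {topic: receiver threads}
    stream = KafkaUtils.createStream(ssc, "localhost:2181", "my-group", {"my-topic": 1})
    stream.pprint()

    ssc.start()
    ssc.awaitTermination()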

Spark streaming giving me a bunch of WARNINGS, please help me understand them

2017-07-09 Thread shyla deshpande
WARN Use an existing SparkContext, some configuration may not take effect. I wanted to restart the Spark streaming app, so I stopped the running one and issued a new spark-submit. Why and how would it use an existing SparkContext? WARN Spark is not running in local mode, therefore the
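That warning is typical of StreamingContext.getOrCreate: if the checkpoint directory already holds state, the context (including much of its configuration) is recovered from the checkpoint rather than rebuilt from the new spark-submit. A minimal Scala sketch, assuming a checkpoint path (illustrative):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val checkpointDir = "hdfs:///checkpoints/my-app"

    // only runs when no checkpoint exists; otherwise the old context,
    // SparkConf and all, is restored from the checkpoint
    def createContext(): StreamingContext = {
      val conf = new SparkConf().setAppName("my-streaming-app")
      val ssc = new StreamingContext(conf, Seconds(10))
      ssc.checkpoint(checkpointDir)
      // ... define DStreams here ...
      ssc
    }

    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()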

the function of countByValueAndWindow and foreachRDD in DStream, would you like help me understand it please?

2017-06-27 Thread ??????????
Hi all, I have code like below: Logger.getLogger("org.apache.spark").setLevel(Level.ERROR) //Logger.getLogger("org.apache.spark.streaming.dstream").setLevel(Level.DEBUG) val conf = new SparkConf().setAppName("testDstream").setMaster("local[4]") //val sc =
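For reference, a minimal Scala sketch of the two operators the subject asks about (window sizes and the source are illustrative): countByValueAndWindow counts each distinct value over a sliding window, and foreachRDD hands each batch to the driver as a plain RDD:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("windowDemo").setMaster("local[4]")
    val ssc = new StreamingContext(conf, Seconds(5))
    ssc.checkpoint("/tmp/ckpt")  // windowed counting needs a checkpoint dir

    val lines = ssc.socketTextStream("localhost", 9999)

    // count occurrences of each line over the last 30s, sliding every 10s
    val counts = lines.countByValueAndWindow(Seconds(30), Seconds(10))

    // runs on the driver once per batch; the body sees a normal RDD
    counts.foreachRDD { rdd => rdd.take(10).foreach(println) }

    ssc.start()
    ssc.awaitTermination()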

Re: the compile of Spark stopped without any hints, would you help me please?

2017-06-25 Thread Ted Yu
i\leveldbjni-all\1.8\leveldbjni-all-1.8.jar;C:\Users\shaof\.m2\repository\org\apache\commons\commons-lang3\3.5\commons-lang3-3.5.jar;C:\Users\shaof\.m2\repository\com\fasterxml\jackson\core\jackson-databind\2.6.5\jackson-databind-2.6.5.jar;C:\Users\shaof\.m2\repository\com\google\guava\guava\14.

the compile of Spark stopped without any hints, would you help me please?

2017-06-25 Thread ??????????
] --- scala-maven-plugin:3.2.2:compile (scala-compile-first) @ spark-network-common_2.11 --- <stuck here for more than 30 minutes>. I stopped it and retried; it stopped at the same point. Would you help me please? Or tell me which
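A common cause of an apparently hung scala-maven-plugin step is the compiler starving for memory and thrashing in GC. The Spark build documentation of that era recommends raising Maven's heap before building:

    export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
    ./build/mvn -DskipTests clean package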

Please help me out!!!! Getting an error while trying to run a Hive Java generic UDF in Spark

2017-01-17 Thread Sirisha Cheruvu
Hi everyone, getting the below error while running a Hive Java UDF from SQLContext: org.apache.spark.sql.AnalysisException: No handler for Hive udf class com.nexr.platform.hive.udf.GenericUDFNVL2 because: com.nexr.platform.hive.udf.GenericUDFNVL2.; line 1 pos 26 at
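For context, the usual way to expose a Hive GenericUDF to Spark SQL is to put the jar on the classpath and register a temporary function; a hedged Scala sketch (the jar path and function name are illustrative, and it assumes a HiveContext rather than a plain SQLContext):

    // spark-submit --jars /path/to/nexr-hive-udf.jar ...
    val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)

    // register the Hive UDF under a SQL-callable name
    hiveContext.sql("CREATE TEMPORARY FUNCTION nvl2 AS 'com.nexr.platform.hive.udf.GenericUDFNVL2'")

    hiveContext.sql("SELECT nvl2(col1, 'not null', 'was null') FROM t").show()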

Re: Help me! Spark WebUI is corrupted!

2015-12-31 Thread Aniket Bhatnagar
cannot click the description of active jobs. It seems there is something missing in my operating system. I googled it but found nothing. Could anybody help me?

Help me! Spark WebUI is corrupted!

2015-12-31 Thread LinChen
Screenshot1 (Normal WebUI), Screenshot2 (Corrupted WebUI). As screenshot2 shows, the format of my Spark WebUI looks strange and I cannot click the description of active jobs. It seems there is something missing in my operating system. I googled it but found nothing. Could anybody help me

anyone who can help me out with this error please

2015-12-04 Thread Mich Talebzadeh
Hi, I am trying to make Hive work with Spark. I have been told that I need to use Spark 1.3 and build it from source code WITHOUT HIVE libraries. I have built it as follows: ./make-distribution.sh --name "hadoop2-without-hive" --tgz
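For reference, a fuller invocation usually pins the Hadoop profile and version too; a hedged sketch (the profiles and versions are illustrative and depend on your Hadoop distribution):

    ./make-distribution.sh --name "hadoop2-without-hive" --tgz \
      -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests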

Please help me understand TF-IDF Vector structure

2015-03-14 Thread Xi Shen
have no idea what the 1st element is... I think the 2nd element is a list of the words, and the 3rd element is a list of the tf-idf values of the words in the previous list. Please help me understand this structure. Thanks, David
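What prints that way is almost certainly an MLlib SparseVector: the 1st element is the vector size (the dimension of the hashed feature space), the 2nd is the sorted indices of the non-zero features, and the 3rd is the tf-idf value at each of those indices. A small Scala sketch of how such vectors arise:

    import org.apache.spark.mllib.feature.{HashingTF, IDF}
    import org.apache.spark.mllib.linalg.SparseVector

    val docs = sc.parallelize(Seq(
      "spark is fast".split(" ").toSeq,
      "spark is a framework".split(" ").toSeq))

    val tf = new HashingTF().transform(docs)      // term frequencies
    tf.cache()
    val tfidf = new IDF().fit(tf).transform(tf)   // tf-idf weights

    tfidf.collect().foreach { v =>
      val sv = v.asInstanceOf[SparseVector]
      // size = hash space dimension, indices = hashed term ids,
      // values = tf-idf weight per index
      println(s"size=${sv.size} indices=${sv.indices.mkString(",")} values=${sv.values.mkString(",")}")
    }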

Re: Please help me understand TF-IDF Vector structure

2015-03-14 Thread Xi Shen
is... I think the 2nd element is a list of the words, and the 3rd element is a list of the tf-idf values of the words in the previous list. Please help me understand this structure. Thanks, David

RE: Help me understand the partition, parallelism in Spark

2015-02-26 Thread java8964
Can anyone share any thoughts related to my questions? Thanks. From: java8...@hotmail.com To: user@spark.apache.org Subject: Help me understand the partition, parallelism in Spark Date: Wed, 25 Feb 2015 21:58:55 -0500 Hi, Sparkers: I come from the Hadoop MapReduce world and am trying to understand

Re: Help me understand the partition, parallelism in Spark

2015-02-26 Thread Imran Rashid
Hi Yong, mostly correct except for: "Since we are doing reduceByKey, shuffling will happen. Data will be shuffled into 1000 partitions, as we have 1000 unique keys." No, you will not get 1000 partitions. Spark has to decide how many partitions to use before it even knows how many
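In other words, the shuffle partition count comes from the operator argument or the default parallelism, never from the number of distinct keys; a short Scala sketch (numbers illustrative):

    val pairs = sc.parallelize(1 to 1000000).map(i => (i % 1000, 1))

    // the partition count is whatever you ask for (or the default
    // parallelism), regardless of how many distinct keys exist
    val byDefault = pairs.reduceByKey(_ + _)
    val explicit  = pairs.reduceByKey(_ + _, 200)

    println(byDefault.partitions.length)  // default parallelism, not 1000
    println(explicit.partitions.length)   // 200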

RE: Help me understand the partition, parallelism in Spark

2015-02-26 Thread java8964
lower memory usage vs speed. Hope my understanding is correct. Thanks Yong Date: Thu, 26 Feb 2015 17:03:20 -0500 Subject: Re: Help me understand the partition, parallelism in Spark From: yana.kadiy...@gmail.com To: iras...@cloudera.com CC: java8...@hotmail.com; user@spark.apache.org Imran, I have

Re: Help me understand the partition, parallelism in Spark

2015-02-26 Thread Yana Kadiyska
Imran, I have also observed the phenomenon of reducing the cores helping with OOM. I wanted to ask this (hopefully without straying off topic): we can specify the number of cores and the executor memory. But we don't get to specify _how_ the cores are spread among executors. Is it possible that

Re: Help me understand the partition, parallelism in Spark

2015-02-26 Thread Zhan Zhang
Here is my understanding. When running on top of YARN, the number of cores means the number of tasks that can run in one executor, but all these cores are located in the same JVM. Parallelism typically controls the balance of tasks. For example, if you have 200 cores but only 50 partitions, there will be 150 cores sitting idle
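When partitions are the bottleneck like this, repartitioning to at least the core count puts the idle cores to work; a short sketch, assuming an existing RDD named rdd:

    println(rdd.partitions.length)          // e.g. 50
    val rebalanced = rdd.repartition(400)   // 2-3x the core count is a common rule of thumb
    println(rebalanced.partitions.length)   // 400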

Re: Help me understand the partition, parallelism in Spark

2015-02-26 Thread Yana Kadiyska
-- Date: Thu, 26 Feb 2015 17:03:20 -0500 Subject: Re: Help me understand the partition, parallelism in Spark From: yana.kadiy...@gmail.com To: iras...@cloudera.com CC: java8...@hotmail.com; user@spark.apache.org Imran, I have also observed the phenomenon of reducing the cores

Help me understand the partition, parallelism in Spark

2015-02-25 Thread java8964
Hi, Sparkers: I come from the Hadoop MapReduce world and am trying to understand some of the internals of Spark. From the web and this list, I keep seeing people talk about increasing the parallelism if you get an OOM error. I have tried to read the documentation as much as possible to understand the RDD

Please help me get started on Apache Spark

2014-11-20 Thread Saurabh Agrawal
Friends, I am pretty new to Spark, as much as to Scala, MLlib, and the entire Hadoop stack!! It would be so much help if I could be pointed to some good books on Spark and MLlib. Further, does MLlib support any algorithms for B2B cross-sell/upsell or customer retention (out of the box

Re: Please help me get started on Apache Spark

2014-11-20 Thread Darin McBeath
Take a look at the O'Reilly Learning Spark (Early Release) book. I've found this very useful. Darin. From: Saurabh Agrawal saurabh.agra...@markit.com To: user@spark.apache.org Sent: Thursday, November 20, 2014 9:04 AM Subject: Please help me get started on Apache

Re: Please help me get started on Apache Spark

2014-11-20 Thread Guibert. J Tchinde
For Spark, you can start with a new book like: https://www.safaribooksonline.com/library/view/learning-spark/9781449359034/ch01.html. I think the paper book is out now. You can also have a look at the tutorials and documentation guide available at: https://spark.apache.org/docs/1.1.0/mllib-guide.html

Can anyone help me set memory for standalone cluster?

2014-06-01 Thread Yunmeng Ban
/spark/conf:/~path/spark/assembly/target/scala-2.10/spark-assembly_2.10-0.9.1-hadoop2.2.0.jar -Xms512M -Xmx512M org.apache.spark.executor.CoarseGrainedExecutorBackend. The memory seems to be the default number, not 1600M. I don't know how to make SPARK_WORKER_MEMORY work. Can anyone help me? Many thanks
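A likely explanation, hedged on how standalone mode sizes executors: SPARK_WORKER_MEMORY only caps the pool a worker may hand out; each executor's heap comes from spark.executor.memory, whose 512 MB default matches the -Xms512M -Xmx512M above. A minimal Scala sketch (the master URL is illustrative):

    // standalone mode: the executor heap is set per application,
    // not by SPARK_WORKER_MEMORY
    val conf = new org.apache.spark.SparkConf()
      .setMaster("spark://master:7077")
      .setAppName("demo")
      .set("spark.executor.memory", "1600m")
    val sc = new org.apache.spark.SparkContext(conf)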

Re: Can anyone help me set memory for standalone cluster?

2014-06-01 Thread Aaron Davidson
The memory seems to be the default number, not 1600M. I don't know how to make SPARK_WORKER_MEMORY work. Can anyone help me? Many thanks in advance. Yunmeng

help me: Out of memory in Spark streaming

2014-05-16 Thread Francis . Hu
Out of Memory after moments. I tried to adjust the JVM GC arguments to speed up the GC process. It made a small difference in performance, but the workers still eventually hit OOM. Is there any way to resolve it? It would be appreciated if anyone can help me get it fixed! Thanks

Re: help me

2014-05-03 Thread Chris Fregly
and it becomes 10x faster. I am very confused. scala> val a = sc.textFile("/user/exobrain/batselem/LUBM1000") scala> f.count() Long = 137805557, took 130.809661618 s
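The usual explanation for that kind of speed-up: the first action reads the file from disk, while a cached RDD serves later actions from memory. A minimal Scala sketch:

    val a = sc.textFile("/user/exobrain/batselem/LUBM1000").cache()
    a.count()  // first action: reads from HDFS and populates the cache
    a.count()  // later actions: served from memory, much faster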

Re: help me

2014-05-02 Thread Mayur Rustagi
/LUBM1000) scala> f.count() Long = 137805557, took 130.809661618 s

help me

2014-04-22 Thread Joe L
-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/help-me-tp4598.html