Can you add sc.stop() at the end of the code and try?
On 1 Dec 2016 18:03, "Daniel van der Ende"
wrote:
> Hi,
>
> I've seen this a few times too. Usually it indicates that your driver
> doesn't have enough resources to process the result. Sometimes increasing
> driver
Hi,
I've seen this a few times too. Usually it indicates that your driver
doesn't have enough resources to process the result. Sometimes increasing
driver memory is enough (increasing the YARN memory overhead can also
help). Is there any specific reason for you to run in client mode and not
in cluster mode?
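A hedged sketch of what that could look like (the flag values are
illustrative, your_job.py is a placeholder, and the
spark.yarn.driver.memoryOverhead setting applies to YARN cluster mode):

  spark-submit --master yarn --deploy-mode cluster \
    --driver-memory 4g \
    --conf spark.yarn.driver.memoryOverhead=1024 \
    your_job.py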
In Spark >= 2.0, SparkSession was introduced, which you can use to query
Hive as well.
Just make sure you create the SparkSession with the enableHiveSupport()
option.
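A minimal sketch in Java (the table name is hypothetical):

  import org.apache.spark.sql.SparkSession;

  SparkSession spark = SparkSession.builder()
      .appName("hive-query")
      .enableHiveSupport()   // wires the session to the Hive metastore
      .getOrCreate();

  spark.sql("SELECT * FROM some_hive_table").show();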
Thanks
Deepak
On Thu, Dec 1, 2016 at 12:27 PM, shyla deshpande
wrote:
> I am Spark 2.0.2 , using DStreams
Hi,
Looks like the ordering of your parameters to spark-submit is different on
Windows vs EMR. I assume the -h flag is an argument for your Python
script? In that case you'll need to put the arguments after the Python
script, as in the sketch below.
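A hedged example of the ordering, reusing the file names from the Windows
command (any Spark options go before the script; the script's own
arguments go after it):

  spark-submit [spark options] new_profile_csv1.py -h 0 -t example_float.txt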
Daniel
On 1 Dec 2016 6:24 a.m., "Patnaik, Vandana"
I am on Spark 2.0.2, using DStreams because I need the Cassandra sink.
How do I create a SQLContext? I get the error that SQLContext is
deprecated.
[image: Inline image 1]
Thanks
Here is another transformation that might cause the error, but it has to
be one of these two since I only have two transformations:

  jsonMessagesDStream
      .window(new Duration(6), new Duration(1000))
      .mapToPair(new PairFunction() {
          @Override
Hi Marco,
Here is what my code looks like:
Config config = new Config("hello");
SparkConf sparkConf = config.buildSparkConfig();
sparkConf.setJars(JavaSparkContext.jarOfClass(Driver.class));
JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, new
Thanks Miguel for the response.
Works great. I am using a tuple for my key, the values are Strings, and I
am returning a String to updateStateByKey.
On Wed, Nov 30, 2016 at 12:33 PM, Miguel Morales
wrote:
> I *think* you can return a map to updateStateByKey which would
8080 is just the normal web UI, which has the information I want, i.e.
running applications, but in HTML format. I want it in JSON so I don't
have to be scraping and parsing HTML.
From my understanding, api/v1/applications should do the trick ... except
it doesn't.
Ah well.
On 1/12/2016 4:00
Don't have a Spark cluster up to verify this, but try port 8080.
http://spark-master-ip:8080/api/v1/applications.
But glad to hear you're getting somewhere, best of luck.
On Wed, Nov 30, 2016 at 9:59 PM, Carl Ballantyne wrote:
> Hmmm getting closer I think.
>
> I
Hmmm, getting closer I think.
I thought this was only for Mesos and YARN clusters (from reading the
documentation). I tried anyway and initially received Connection Refused.
So I ran ./start-history-server.sh. This was on the Spark Master
instance.
I now get 404 Not Found.
Nothing in the
Try hitting: http://<host>:18080/api/v1
Then hit /applications.
That should give you a list of running Spark jobs on a given server.
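A hedged way to check from a shell, with <host> standing in for whichever
machine runs the history server:

  curl http://<host>:18080/api/v1/applications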
On Wed, Nov 30, 2016 at 9:30 PM, Carl Ballantyne
wrote:
>
> Yes I was looking at this. But it says I need to access the driver -
>
Yes I was looking at this. But it says I need to access the driver -
http://<driver-node>:4040.
I don't have a running Spark driver instance since I am submitting jobs
to Spark using the SparkLauncher class. Or maybe I am missing something
obvious. Apologies if so.
On 1/12/2016 3:21 PM, Miguel
Hello All,
I am new to Spark and am wondering how to pass an optional argument to my
Python program using spark-submit.
This works fine on my local machine but not on AWS EMR:
On Windows:
C:\Vandana\spark\examples>..\bin\spark-submit new_profile_csv1.py -h 0 -t
example_float.txt
On EMR:
Check the Monitoring and Instrumentation API:
http://spark.apache.org/docs/latest/monitoring.html
On Wed, Nov 30, 2016 at 9:20 PM, Carl Ballantyne wrote:
> Hi All,
>
> I want to get the running applications for my Spark Standalone cluster in
> JSON format. The same
Hi All,
I want to get the running applications for my Spark Standalone cluster
in JSON format. The same information displayed on the web UI on port
8080 ... but in JSON.
Is there an easy way to do this? It seems I need to scrape the HTML page
in order to get this information.
The reason I
Hi Spark experts,
Can anyone help with doing SVR (support vector regression) in Spark?
Thanks
R
On Tue, Nov 29, 2016 at 6:50 PM, roni wrote:
> Hi All,
> I am trying to change my R code to Spark. I am using SVM regression in R.
> It seems like Spark is providing
Spark 2.0.1 is running with a different py4j library than Spark 1.6.
You will probably run into other problems mixing versions though - is there a
reason you can't run Spark 1.6 on the client?
From: Klaus Schaefers
Could you paste a reproducible code snippet?
Kr
On 30 Nov 2016 9:08 pm, "kant kodali" wrote:
> I have lot of these exceptions happening
>
> java.lang.Exception: Could not compute split, block input-0-1480539568000
> not found
>
>
> Any ideas what this could be?
>
I have a lot of these exceptions happening:
java.lang.Exception: Could not compute split, block input-0-1480539568000
not found
Any ideas what this could be?
I *think* you can return a map to updateStateByKey which would include
your fields. Another approach would be to create a hash (e.g. create a
JSON version of the hash and return that); see the sketch below.
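A minimal Java sketch of such an update function for a composite key (all
names hypothetical; the state here is a String, but any serializable type
should work):

  import java.util.List;
  import org.apache.spark.api.java.Optional;
  import org.apache.spark.api.java.function.Function2;

  // Works for any key type, e.g. Tuple2<String, String> composite keys:
  // the update function only sees the new values and the previous state.
  Function2<List<String>, Optional<String>, Optional<String>> updateFn =
      (values, current) -> {
          String previous = current.isPresent() ? current.get() : "";
          return Optional.of(previous + String.join(",", values));
      };

  // Given a JavaPairDStream<Tuple2<String, String>, String> keyed:
  //   keyed.updateStateByKey(updateFn);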
On Wed, Nov 30, 2016 at 12:30 PM, shyla deshpande
wrote:
> updateStateByKey - Can
updateStateByKey - can this be used when the key is multi-column (like a
composite key) and the value is not numeric? All the examples I have come
across are ones where the key is a simple String and the value is
numeric. I'd appreciate any help.
Thanks
Dear Apache enthusiast,
ApacheCon and Apache Big Data will be held at the Intercontinental in
Miami, Florida, May 16-18, 2017. Submit your talks, and register, at
http://apachecon.com/ Talks aimed at the Big Data section of the event
should go to
This should fix it: https://github.com/apache/spark/pull/16080
On Wed, Nov 30, 2016 at 10:55 AM, Timur Shenkao wrote:
> Hello,
>
> Yes, I used hiveContext, sqlContext, sparkSession from Java, Scala,
> Python.
> Via spark-shell, spark-submit, IDE (PyCharm, Intellij IDEA).
>
Hello,
Yes, I used hiveContext, sqlContext, sparkSession from Java, Scala, Python.
Via spark-shell, spark-submit, IDE (PyCharm, Intellij IDEA).
Everything is perfect because I have a Hadoop cluster with a configured &
tuned Hive.
The usual cause of Michael's error is a misconfigured or absent Hive.
Hi Timur,
did you use hiveContext, sqlContext, or the Spark way mentioned in
http://spark.apache.org/docs/latest/sql-programming-guide.html?
Regards,
Gourav Sengupta
On Wed, Nov 30, 2016 at 5:35 PM, Yin Huai wrote:
> Hello Michael,
>
> Thank you for reporting this
Hi Sean,
I think that the main issue was users importing the package while
starting Spark, just the way we used to in Spark 1.6. After removing that
option from --packages while starting Spark 2.0, the issue of conflicting
libraries disappeared.
I have written about this in
Hello Michael,
Thank you for reporting this issue. It will be fixed by
https://github.com/apache/spark/pull/16080.
Thanks,
Yin
On Tue, Nov 29, 2016 at 11:34 PM, Timur Shenkao wrote:
> Hi!
>
> Do you have real HIVE installation?
> Have you built Spark 2.1 & Spark 2.0 with
Hi Folks,
I have a Spark job reading a CSV file into a DataFrame. I register that
DataFrame as a temp table, then I write that DataFrame/temp table to a
Hive external table (using Parquet format for storage).
I'm using this kind of command:
hiveContext.sql("INSERT INTO TABLE t
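For reference, a hedged sketch of the same flow using the Spark 2.x
SparkSession API (all names are hypothetical, a Hive-enabled SparkSession
named spark is assumed, and the external Parquet-backed table is assumed
to exist already):

  import org.apache.spark.sql.Dataset;
  import org.apache.spark.sql.Row;

  // read the CSV, expose it as a temp view, then insert into the Hive table
  Dataset<Row> df = spark.read().option("header", "true").csv("/data/in.csv");
  df.createOrReplaceTempView("tmp_csv");
  spark.sql("INSERT INTO TABLE my_external_table SELECT * FROM tmp_csv");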
Hi,
I want to connect a local Jupyter notebook to a remote Spark cluster.
The cluster is running Spark 2.0.1, and the Jupyter notebook is based on
Spark 1.6 and running in a Docker image (Link). I try to init the
SparkContext like this:
import pyspark
sc =
Hi All,
I am wondering if it makes sense to have two receivers inside my Spark
client program?
The use case is as follows.
1) We have to support a feed from Kafka, so this will be direct receiver
#1. We need to perform batch inserts from the Kafka feed to Cassandra.
2) A gRPC receiver where we
Hello,
I have been trying to implement logistic regression using gradient
ascent, out of curiosity. I am using the Spark ML feature extraction
packages and data frames, and not any of the implemented algorithms. I
would be grateful if any of you could cast an eye over it and provide
some feedback.
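For anyone curious, a hedged sketch of the core step in plain Java
(outside Spark; names and values are illustrative): each gradient-ascent
step on the log-likelihood moves the weights by alpha times the average
of (y - sigmoid(w.x)) * x.

  // One gradient-ascent step for binary logistic regression (y in {0, 1}).
  static double sigmoid(double z) { return 1.0 / (1.0 + Math.exp(-z)); }

  static void step(double[] w, double[][] x, double[] y, double alpha) {
      double[] grad = new double[w.length];
      for (int i = 0; i < x.length; i++) {
          double dot = 0.0;
          for (int j = 0; j < w.length; j++) dot += w[j] * x[i][j];
          double err = y[i] - sigmoid(dot);   // ascent on the log-likelihood
          for (int j = 0; j < w.length; j++) grad[j] += err * x[i][j];
      }
      for (int j = 0; j < w.length; j++) w[j] += alpha * grad[j] / x.length;
  }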