at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
--
*Alexander Krasheninnikov*
Head of Data Team
Hello!
While operating on a JavaDStream you may use the transform() or foreach()
methods, which give you access to the underlying RDD.
JavaDStream<String> dataFrameStream =
    ctx.textFileStream("source").transform(new Function2<JavaRDD<String>, Time, JavaRDD<String>>() {
        @Override
        public JavaRDD<String> call(JavaRDD<String> incomingRdd, Time batchTime) {
            // operate on the underlying RDD here; the rest of this snippet was truncated
            return incomingRdd;
        }
    });
If you are profiling in standalone mode, I recommend trying Java
Mission Control.
You just need to start the app with these params:
-XX:+UnlockCommercialFeatures -XX:+FlightRecorder
-Dcom.sun.management.jmxremote=true
-Dcom.sun.management.jmxremote.port=$YOUR_PORT
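A sketch of wiring those flags into spark-submit; the class name, jar, and
port are placeholders:
spark-submit \
  --class com.example.YourApp \
  --conf "spark.driver.extraJavaOptions=-XX:+UnlockCommercialFeatures -XX:+FlightRecorder -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.port=7091" \
  your-app.jar
Then point Java Mission Control at the exposed JMX port.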
Collect all your RDDs from the single files into a List, then call
context.union(context.emptyRDD(), YOUR_LIST); otherwise, with a greater
number of elements to union, you will get a stack overflow exception.
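A minimal sketch of that pattern; the variable names are illustrative:
List<JavaRDD<String>> rdds = new ArrayList<>();
for (String path : files) {
    rdds.add(ctx.textFile(path));
}
// a single multi-way union keeps the lineage shallow; chaining
// rdd.union(other) pairwise builds a deep lineage and overflows the stack
JavaRDD<String> union = ctx.union(ctx.<String>emptyRDD(), rdds);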
Hello, everyone!
I have a case when running a standalone cluster: when stop-all.sh/start-all.sh
are invoked on the master, the streaming app loses all its executors but does
not interrupt.
Since it is a streaming app that is expected to deliver its results ASAP,
downtime is undesirable.
Is there any workaround?
I saw such an example in the docs:
--conf spark.driver.extraJavaOptions=-Dlog4j.configuration=file://$path_to_file
but, unfortunately, it does not work for me.
On 30.07.2015 05:12, canan chen wrote:
Yes, that should work. What I mean is: is there any option in the
spark-submit command that I can specify
I made such a naive implementation:
SparkConf conf = new SparkConf();
conf.setMaster("local[4]").setAppName("Stub");
final JavaSparkContext ctx = new JavaSparkContext(conf);
JavaRDD<String> input = ctx.textFile(path_to_file);
// explode each line into a list of column values
JavaRDD<List<String>> rowValues =
    input.map(line -> Arrays.asList(line.split(","))); // completion assumed; the snippet was cut here
Have you tried putting this file on the local disk of each of the executor
nodes? That worked for me.
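A sketch of that approach, assuming log4j.properties has been copied to the
same local path on every node (the path is a placeholder):
spark-submit \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:///opt/conf/log4j.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:///opt/conf/log4j.properties" \
  ...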
On 05.06.2015 16:56, nib...@free.fr wrote:
Hello,
I want to override the log4j configuration when I start my spark job.
I tried:
.../bin/spark-submit --class --conf
I had the same problem.
The solution I found was to use:
JavaStreamingContext streamingContext =
JavaStreamingContext.getOrCreate("checkpoint_dir", contextFactory);
ALL configuration should be performed inside contextFactory. If you try to
configure the streaming context after getOrCreate, the changes will not take
effect when the context is recovered from the checkpoint.
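A minimal sketch of that pattern; the batch interval, source, and directory
names are illustrative:
Function0<JavaStreamingContext> contextFactory = () -> {
    SparkConf conf = new SparkConf().setAppName("Stub");
    JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));
    // ALL stream wiring happens here, so it can be restored from the checkpoint
    jssc.textFileStream("source").print();
    jssc.checkpoint("checkpoint_dir");
    return jssc;
};
JavaStreamingContext streamingContext =
    JavaStreamingContext.getOrCreate("checkpoint_dir", contextFactory);
streamingContext.start();
streamingContext.awaitTermination();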
Hello, everyone.
I am developing a streaming application that works with window functions:
each window creates a table and performs some SQL operations on the
extracted data.
I ran into a problem: when using window operations together with
checkpointing, the application does not start the next time.
Here is the code: