https://mvnrepository.com/artifact/org.apache.spark/spark-network-yarn_2.11/2.0.2
On Mon, Dec 12, 2016 at 9:56 PM, Neal Yin wrote:
> Hi,
>
> For dynamic allocation feature, I need spark-xxx-yarn-shuffle.jar. In my
> local spark build, I can see it. But in maven central, I
(-dev)
Just configure your log4j.properties in $SPARK_HOME/conf (or set a
custom $SPARK_CONF_DIR for the history server).
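For reference, a minimal log4j.properties sketch (the file path and levels below are examples, not required values):

```properties
# Example $SPARK_CONF_DIR/log4j.properties for the history server.
# The appender file path is an assumption; adjust to your layout.
log4j.rootCategory=INFO, file
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.File=/var/log/spark/spark-history-server.log
log4j.appender.file.MaxFileSize=50MB
log4j.appender.file.MaxBackupIndex=5
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```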
On Thu, Dec 8, 2016 at 7:20 PM, John Fang wrote:
> ./start-history-server.sh
> starting org.apache.spark.deploy.history.HistoryServer, logging
Sure - I wanted to check with admin before sharing. I’ve attached it now,
> does this help?
>
> Many thanks again,
>
> G
>
>
>
>> On 8 Dec 2016, at 20:18, Marcelo Vanzin <van...@cloudera.com> wrote:
>>
>> Then you probably have a configuration error
> I can run the SparkPi test script. The main difference between it and my
> application is that it doesn’t access HDFS.
>
>> On 8 Dec 2016, at 18:43, Marcelo Vanzin <van...@cloudera.com> wrote:
>>
>> On Wed, Dec 7, 2016 at 11:54 PM, Gerard Casey <gerardhughca...
On Wed, Dec 7, 2016 at 11:54 PM, Gerard Casey wrote:
> To be specific, where exactly should spark.authenticate be set to true?
spark.authenticate has nothing to do with kerberos. It's for
authentication between different Spark processes belonging to the same
app.
--
have a configuration issue somewhere.
On Wed, Dec 7, 2016 at 1:09 PM, Gerard Casey <gerardhughca...@gmail.com> wrote:
> Thanks.
>
> I’ve checked the TGT, principal and key tab. Where to next?!
>
>> On 7 Dec 2016, at 22:03, Marcelo Vanzin <van...@cloudera.com> wrote:
On Wed, Dec 7, 2016 at 12:15 PM, Gerard Casey wrote:
> Can anyone point me to a tutorial or a run through of how to use Spark with
> Kerberos? This is proving to be quite confusing. Most search results on the
> topic point to what needs inputted at the point of `sparks
ngMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> at
> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSu
There's generally an exception in these cases, and you haven't posted
it, so it's hard to tell you what's wrong. The most probable cause,
without the extra information the exception provides, is that you're
using the wrong Hadoop configuration when submitting the job to YARN.
On Mon, Dec 5, 2016
On Tue, Nov 15, 2016 at 5:57 PM, Elkhan Dadashov wrote:
> This is confusing in the sense that, the client needs to stay alive for
> Spark Job to finish successfully.
>
> Actually the client can die or finish (in Yarn-cluster mode), and the spark
> job will successfully
On Thu, Nov 10, 2016 at 2:43 PM, Mohammad Tariq wrote:
> @Override
> public void stateChanged(SparkAppHandle handle) {
> System.out.println("Spark App Id [" + handle.getAppId() + "]. State [" +
> handle.getState() + "]");
> while(!handle.getState().isFinal()) {
> On Tue, Nov 8, 2016 at 5:06 AM, Marcelo Vanzin <van...@cloudera.com>
> wrote:
On Mon, Nov 7, 2016 at 3:29 PM, Mohammad Tariq wrote:
> I have been trying to use SparkLauncher.startApplication() to launch a Spark
> app from within java code, but unable to do so. However, same piece of code
> is working if I use SparkLauncher.launch().
>
> Here are the
On Sat, Nov 5, 2016 at 2:54 AM, Elkhan Dadashov wrote:
> while (appHandle.getState() == null || !appHandle.getState().isFinal()) {
> if (appHandle.getState() != null) {
> log.info("while: Spark job state is : " + appHandle.getState());
> if
On Fri, Nov 4, 2016 at 1:57 AM, Zsolt Tóth wrote:
> This was what confused me in the first place. Why does Spark ask for new
> tokens based on the renew-interval instead of the max-lifetime?
It could be just a harmless bug, since tokens have a "getMaxDate()"
method
On Thu, Nov 3, 2016 at 3:47 PM, Zsolt Tóth wrote:
> What is the purpose of the delegation token renewal (the one that is done
> automatically by Hadoop libraries, after 1 day by default)? It seems that it
> always happens (every day) until the token expires, no matter
> manager somehow automatically renews the delegation tokens for my
> application?
>
> 2016-11-03 21:34 GMT+01:00 Marcelo Vanzin <van...@cloudera.com>:
>>
>> Sounds like your test was set up incorrectly. The default TTL for
>> tokens is 7 days. Did you change that in the HDFS config?
Sounds like your test was set up incorrectly. The default TTL for
tokens is 7 days. Did you change that in the HDFS config?
The issue definitely exists and people definitely have run into it. So
if you're not hitting it, it's most definitely an issue with your test
configuration.
On Thu, Nov 3,
On Fri, Oct 28, 2016 at 11:14 AM, Elkhan Dadashov wrote:
> But if the map task will finish before the Spark job finishes, that means
> SparkLauncher will go away. if the SparkLauncher handle goes away, then I
> lose the ability to track the app's state, right ?
>
> I'm
If you look at the "startApplication" method it takes listeners as parameters.
On Fri, Oct 28, 2016 at 10:23 AM, Elkhan Dadashov wrote:
> Hi,
>
> I know that we can use SparkAppHandle (introduced in SparkLauncher version
>>=1.6), and let the delegator map task stay alive
On Tue, Oct 18, 2016 at 3:01 PM, Elkhan Dadashov wrote:
> Does my map task need to wait until Spark job finishes ?
No...
> Or is there any way, my map task finishes after launching Spark job, and I
> can still query and get status of Spark job outside of map task (or
Use:
spark-submit --jars /path/sqldriver.jar --conf
spark.driver.extraClassPath=sqldriver.jar --conf
spark.executor.extraClassPath=sqldriver.jar
In client mode the driver's classpath needs to point to the full path,
not just the name.
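Put together (a sketch, reusing the hypothetical jar path from above):

```
# Client mode: the driver runs locally, so its classpath entry must be a
# full path. The executor entry can stay relative because --jars localizes
# the jar into each executor's working directory.
spark-submit \
  --jars /path/sqldriver.jar \
  --conf spark.driver.extraClassPath=/path/sqldriver.jar \
  --conf spark.executor.extraClassPath=sqldriver.jar
```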
On Wed, Sep 14, 2016 at 5:42 AM, Kevin Tran
You're running spark-shell. It already creates a SparkContext for you and
makes it available in a variable called "sc".
If you want to change the config of spark-shell's context, you need to use
command line option. (Or stop the existing context first, although I'm not
sure how well that will
It kinda depends on the application. Certain compression libraries, in
particular, are kinda lax with their use of off-heap buffers, so if
you configure executors to use many cores you might end up with higher
usage than the default configuration. Then there are also things like
PARQUET-118.
You haven't said which version of Spark you are using. The state API
only works if the underlying Spark version is also 1.6 or later.
On Mon, Aug 29, 2016 at 4:36 PM, ckanth99 wrote:
> Hi All,
>
> I have a web application which will submit spark jobs on Cloudera spark
>
I believe the Impala JDBC driver is mostly the same as the Hive
driver, but I could be wrong. In any case, the right place to ask that
question is the Impala groups (see http://impala.apache.org/).
On a side note, it is a little odd that you're trying to read data
from Impala using JDBC, instead
Yes, the 2.0 history server should be backwards compatible.
On Fri, Aug 5, 2016 at 2:14 PM, Koert Kuipers wrote:
> we have spark 1.5.x, 1.6.x and 2.0.0 job running on yarn
>
> but yarn can have only one spark history server.
>
> what to do? is it safe to use the spark 2
On Fri, Aug 5, 2016 at 9:53 AM, Carlo.Allocca wrote:
>
> org.apache.spark
> spark-core_2.10
> 2.0.0
> jar
>
>
> org.apache.spark
> spark-sql_2.10
> 2.0.0
>
The Flume connector is still available from Spark:
http://search.maven.org/#artifactdetails%7Corg.apache.spark%7Cspark-streaming-flume-assembly_2.11%7C2.0.0%7Cjar
Many of the others have indeed been removed from Spark, and can be
found at the Apache Bahir project: http://bahir.apache.org/
Solved!
> But this is a bug?
> ===
> Name: cen sujun
> Mobile: 13067874572
> Mail: ce...@lotuseed.com
>
> On 29 Jul 2016, at 08:19, Marcelo Vanzin <van...@cloudera.com> wrote:
>
> spark.hadoop.yarn.timelin
You can probably do that in Spark's conf too:
spark.hadoop.yarn.timeline-service.enabled=false
On Thu, Jul 28, 2016 at 5:13 PM, Jeff Zhang wrote:
> One workaround is disable timeline in yarn-site,
>
> set yarn.timeline-service.enabled as false in yarn-site.xml
>
> On Thu, Jul
On Wed, Jun 22, 2016 at 1:32 PM, Mich Talebzadeh
wrote:
> Does it also depend on the number of Spark nodes involved in choosing which
> way to go?
Not really.
--
Marcelo
Trying to keep the answer short and simple...
On Wed, Jun 22, 2016 at 1:19 PM, Michael Segel
wrote:
> But this gets to the question… what are the real differences between client
> and cluster modes?
> What are the pros/cons and use cases where one has advantages over
It doesn't hurt to have a bug tracking it, in case anyone else has
time to look at it before I do.
On Mon, Jun 20, 2016 at 1:20 PM, Jonathan Kelly <jonathaka...@gmail.com> wrote:
> Thanks for the confirmation! Shall I cut a JIRA issue?
>
> On Mon, Jun 20, 2016 at 10:42 AM Marc
I just tried this locally and can see the wrong behavior you mention.
I'm running a somewhat old build of 2.0, but I'll take a look.
On Mon, Jun 20, 2016 at 7:04 AM, Jonathan Kelly wrote:
> Does anybody have any thoughts on this?
>
> On Fri, Jun 17, 2016 at 6:36 PM
> Thank you Marcelo. I don't know how to remove it. Could you please tell me
> how I can remove that configuration?
>
> On Mon, Jun 6, 2016 at 5:04 PM, Marcelo Vanzin <van...@cloudera.com> wrote:
>>
>> This sounds like your default Spark configuration has an
>> "
This sounds like your default Spark configuration has an
"enabledAlgorithms" config in the SSL settings, and that is listing an
algorithm name that is not available in jdk8. Either remove that
configuration (to use the JDK's default algorithm list), or change it
so that it lists algorithms
On Mon, Jun 6, 2016 at 4:22 AM, shengzhixia wrote:
> In my previous Java project I can change class loader without problem. Could
> I know why the above method couldn't change class loader in spark shell?
> Any way I can achieve it?
The spark-shell for Scala 2.10 will
On Mon, May 23, 2016 at 4:41 AM, Chandraprakash Bhagtani
wrote:
> I am passing hive-site.xml through --files option.
You need hive-site.xml in Spark's classpath too. The easiest way is to
copy / symlink hive-site.xml in your Spark's conf directory.
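E.g. (the source path here is an assumption; adjust to wherever your Hive config lives):

```
ln -s /etc/hive/conf/hive-site.xml $SPARK_HOME/conf/hive-site.xml
```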
--
Marcelo
Hi Weifeng,
That's the Spark event log, not the YARN application log. You get the
latter using the "yarn logs" command.
On Fri, May 20, 2016 at 1:14 PM, Cui, Weifeng wrote:
> Here is the application log for this spark job.
>
> http://pastebin.com/2UJS9L4e
>
>
>
> Thanks,
>
On Thu, May 19, 2016 at 6:06 PM, Mathieu Longtin wrote:
> I'm looking to bypass the master entirely. I manage the workers outside of
> Spark. So I want to start the driver, the start workers that connect
> directly to the driver.
It should be possible to do that if you
Hi Mathieu,
There's nothing like that in Spark currently. For that, you'd need a
new cluster manager implementation that knows how to start executors
in those remote machines (e.g. by running ssh or something).
In the current master there's an interface you can implement to try
that if you
Hi Anubhav,
This is happening because you're trying to use the configuration
generated for CDH with upstream Spark. The CDH configuration will add
extra needed jars that we don't include in our build of Spark, so
you'll end up getting duplicate classes.
You can either try to use a different
ps://issues.apache.org/jira/secure/ViewProfile.jspa?name=zjffdu> added
> a comment - 26/Nov/15 08:15
>
> Marcelo Vanzin
> <https://issues.apache.org/jira/secure/ViewProfile.jspa?name=vanzin> Is
> there any user document about it ? I didn't find it on the spark official
> site. If this
Stephen Boesch <java...@gmail.com> wrote:
>
> There is a committed PR from Marcelo Vanzin addressing that capability:
>
> https://github.com/apache/spark/pull/3916/files
>
> Is there any documentation on how to use this? The PR itself has two
> comments asking for the docs t
Is the class mentioned in the exception below the parent class of the
anonymous "Function" class you're creating?
If so, you may need to make it serializable. Or make your function a
proper "standalone" class (either a nested static class or a top-level
one).
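A rough sketch of the second option, using plain Java serialization to illustrate the point (the `Function` interface and names here are illustrative, not Spark's own API):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

public class Functions {
    // A "standalone" (static nested) function class. Being static, it holds
    // no hidden reference to an enclosing instance, and it implements
    // Serializable, so it can be shipped over the wire.
    static class AddOne implements Function<Integer, Integer>, Serializable {
        @Override
        public Integer apply(Integer x) { return x + 1; }
    }

    public static void main(String[] args) throws Exception {
        // Round-trip through Java serialization, roughly what Spark does
        // when sending a function to an executor.
        AddOne f = new AddOne();
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bos);
        out.writeObject(f);
        out.flush();
        ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()));
        AddOne g = (AddOne) in.readObject();
        System.out.println(g.apply(41)); // 42
    }
}
```

If the class were a non-static inner class or an anonymous class, serialization would try to drag the enclosing instance along, and fail if that parent isn't serializable.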
On Wed, May 11, 2016 at 3:55 PM,
On Mon, May 9, 2016 at 3:34 PM, Matt Cheah wrote:
> @Marcelo: Interesting - why would this manifest on the YARN-client side
> though (as Spark is the client to YARN in this case)? Spark as a client
> shouldn’t care about what auxiliary services are on the YARN cluster.
The
Hi Jesse,
On Mon, May 9, 2016 at 2:52 PM, Jesse F Chen wrote:
> Sean - thanks. definitely related to SPARK-12154.
> Is there a way to continue use Jersey 1 for existing working environment?
The error you're getting is because of a third-party extension that
tries to talk to
See http://spark.apache.org/docs/latest/running-on-yarn.html,
especially the parts that talk about
spark.yarn.historyServer.address.
On Mon, May 2, 2016 at 2:14 PM, satish saley wrote:
>
>
> Hello,
>
> I am running pyspark job using yarn-cluster mode. I can see spark job
On Fri, Apr 22, 2016 at 10:38 AM, Mich Talebzadeh
wrote:
> I am trying to test Spark with CEP and I have been shown a sample here
>
Sorry, I've been looking at this thread and the related ones and one
thing I still don't understand is: why are you trying to use internal
Spark classes like Logging and SparkFunSuite in your code?
Unless you're writing code that lives inside Spark, you really
shouldn't be trying to reference
On Thu, Apr 14, 2016 at 2:14 PM, Benjamin Zaitlen wrote:
>> spark-submit --master yarn-cluster /home/ubuntu/test_spark.py --files
>> /home/ubuntu/localtest.txt#appSees.txt
--files should come before the path to your python script. Otherwise
it's just passed as arguments to
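With the quoted command, the fix would look like:

```
spark-submit --master yarn-cluster \
  --files /home/ubuntu/localtest.txt#appSees.txt \
  /home/ubuntu/test_spark.py
```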
You can set "spark.yarn.security.tokens.hive.enabled=false" in your
config, although your app won't work if you actually need Hive
delegation tokens.
On Thu, Apr 14, 2016 at 12:21 AM, Luca Rea
wrote:
> Hi Jeff,
>
>
>
> Thank you for your support, I’ve removed
On Fri, Apr 1, 2016 at 9:23 AM, Truong Duc Kien wrote:
> I need to gather some metrics using a SparkListener. Does the callback
> methods need to thread-safe or they are always call from the same thread ?
The callbacks are all fired on the same thread. Just be careful
If you use any shuffle service before 2.0 it should be compatible with
all previous releases.
The 2.0 version has currently an incompatibility that we should
probably patch before releasing 2.0, to support this kind of use case
(among others).
On Fri, Mar 18, 2016 at 7:25 PM, Koert Kuipers
On Thu, Feb 18, 2016 at 10:26 AM, wgtmac wrote:
> In the code, I did following:
> val sc = new SparkContext(new
> SparkConf().setAppName("test").set("spark.driver.memory", "4g"))
You can't set the driver memory like this, in any deploy mode. When
that code runs, the driver is
You don't... just send a new one.
On Fri, Feb 5, 2016 at 9:33 AM, swetha kasireddy
wrote:
> Hi,
>
> I want to edit/delete a message posted in Spark User List. How do I do that?
>
> Thanks!
--
Marcelo
Without the exact error from the driver that caused the job to restart,
it's hard to tell. But a simple way to improve things is to install the
Spark shuffle service on the YARN nodes, so that even if an executor
crashes, its shuffle output is still available to other executors.
On Wed, Feb 3,
urce-allocation
>
>
>
> On Wed, Feb 3, 2016 at 11:50 AM, Marcelo Vanzin <van...@cloudera.com>
> wrote:
>
>> Without the exact error from the driver that caused the job to restart,
>> it's hard to tell. But a simple way to improve things is to install the
>&
> Unrecognized VM option
> 'newsize=2096m,-XX:MaxPermSize=512m,-XX:+PrintGCDetails,-XX:+PrintGCTimeStamps,-XX:+UseParNewGC,-XX:+UseConcMarkSweepGC,-XX:CMSInitiatingOccupancyFraction=80,-XX:GCTimeLimit=5,-XX:GCHeapFreeLimit=95'
>
>
>
>
> From: Marcelo Vanzin
> Date: 2016
On Thu, Jan 21, 2016 at 5:42 AM, Olivier Devoisin
wrote:
> The documentation states that it contains VM overheads, interned strings and
> other native overheads. However it's really vague.
It's intentionally vague, because it's "everything that is not Java
On Wed, Jan 20, 2016 at 7:38 PM, our...@cnsuning.com
wrote:
> --driver-java-options $sparkdriverextraJavaOptions \
You need quotes around "$sparkdriverextraJavaOptions".
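A quick illustration of why (plain shell; the option values are made up):

```shell
# Without quotes the shell word-splits the value, so only the first JVM
# option reaches --driver-java-options; with quotes it stays one argument.
sparkdriverextraJavaOptions='-XX:+UseG1GC -XX:MaxGCPauseMillis=200'
set -- $sparkdriverextraJavaOptions     # unquoted expansion
echo "unquoted: $# words"
set -- "$sparkdriverextraJavaOptions"   # quoted expansion
echo "quoted: $# word"
```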
--
Marcelo
On Thu, Jan 14, 2016 at 10:17 AM, Sanjeev Verma
wrote:
> now it spawn a single executors with 1060M size, I am not able to understand
> why this time it executes executors with 1G+overhead not 2G what I
> specified.
Where are you looking for the memory size for the
> I am looking into the web ui of spark application master(tab executors).
>
> On Fri, Jan 15, 2016 at 12:08 AM, Marcelo Vanzin <van...@cloudera.com>
> wrote:
>>
>> On Thu, Jan 14, 2016 at 10:17 AM, Sanjeev Verma
>> <sanjeev.verm...@gmail.com> wrote:
>> &
SparkSubmitDriverBootstrapper was removed back in Spark 1.4, so it
seems you have a mixed bag of 1.3 / 1.6 in your path / classpath and
things are failing because of that.
On Wed, Jan 13, 2016 at 9:31 AM, Lin Zhao wrote:
> My job runs fine in yarn cluster mode but I have reason to
Try "git grep -i spark.memory.offheap.size"...
On Wed, Jan 6, 2016 at 2:45 PM, Ted Yu wrote:
> Maybe I looked in the wrong files - I searched *.scala and *.java files (in
> latest Spark 1.6.0 RC) for '.offheap.' but didn't find the config.
>
> Can someone enlighten me ?
>
>
If you're trying to compile against Scala 2.11, you're missing
"-Dscala-2.11" in that command.
On Wed, Jan 6, 2016 at 12:27 PM, Jade Liu wrote:
> Hi, Todd:
>
> Thanks for your suggestion. Yes I did run the ./dev/change-scala-version.sh
> 2.11 script when using scala version
You should be looking at the YARN RM web ui to monitor YARN
applications; that will have a link to the Spark application's UI,
along with other YARN-related information.
Also, if you run the app in client mode, it might be easier to debug
it until you know it's running properly (since you'll see
Hi Prateek,
Are you using CDH 5.5 by any chance? We fixed this bug in an upcoming
patch. Unfortunately there's no workaround at the moment... it doesn't
affect upstream Spark either.
On Fri, Dec 11, 2015 at 2:05 PM, prateek arora
wrote:
>
>
> Hi
>
> I am trying to
On Thu, Dec 17, 2015 at 3:31 PM, Vikram Kone wrote:
> No we are using standard spark w/ datastax cassandra. I'm able to see some
> json when I do http://10.1.40.16:7080/json/v1/applications
> but getting the following errors when I do
>
> I'm attaching all container logs. Can you please take a look at it when you
> get a chance.
>
> Thanks
> Prasad
>
> On Sat, Dec 5, 2015 at 2:30 PM, Marcelo Vanzin <van...@cloudera.com> wrote:
>>
>> On Fri, Dec 4, 2015 at 5:47 PM, prasadreddy <alle.re...@gma
Hi Prasad, please reply to the list so that others can benefit / help.
On Sat, Dec 5, 2015 at 4:06 PM, Prasad Reddy wrote:
> Have you had a chance to try this authentication for any of your projects
> earlier.
Yes, we run with authenticate=true by default. It works fine.
On Fri, Dec 4, 2015 at 5:47 PM, prasadreddy wrote:
> I am running Spark YARN and trying to enable authentication by setting
> spark.authenticate=true. After enable authentication I am not able to Run
> Spark word count or any other programs.
Define "I am not able to run".
(bcc: user@spark, since this is Hive code.)
You're probably including unneeded Spark jars in Hive's classpath
somehow. Either the whole assembly or spark-hive, both of which will
contain Hive classes, and in this case contain old versions that
conflict with the version of Hive you're running.
On
On Thu, Dec 3, 2015 at 10:32 AM, Mich Talebzadeh wrote:
> hduser@rhes564::/usr/lib/spark/logs> hive --version
> SLF4J: Found binding in
> [jar:file:/usr/lib/spark/lib/spark-assembly-1.3.0-hadoop2.4.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
As I suggested before, you
On Tue, Dec 1, 2015 at 12:45 PM, Charles Allen
wrote:
> Is there a way to pass configuration file resources to be resolvable through
> the classloader?
Not in general. If you're using YARN, you can cheat and use
"spark.yarn.dist.files" which will place those files
On Tue, Dec 1, 2015 at 9:43 PM, Anfernee Xu wrote:
> But I have a single server(JVM) that is creating SparkContext, are you
> saying Spark supports multiple SparkContext in the same JVM? Could you
> please clarify on this?
I'm confused. Nothing you said so far requires
On Tue, Dec 1, 2015 at 3:32 PM, Anfernee Xu wrote:
> I have a long running backend server where I will create a short-lived Spark
> job in response to each user request, base on the fact that by default
> multiple Spark Context cannot be created in the same JVM, looks like
On Mon, Nov 23, 2015 at 6:24 PM, gpriestley wrote:
> Questions I have are:
> 1) How does the spark.yarn.am.port relate to defined ports within Spark
> (driver, executor, block manager, etc.)?
> 2) Doe the spark.yarn.am.port parameter only relate to the spark
>
We've had this in the past when using "@VisibleForTesting" in classes
that for some reason the shell tries to process. QueryExecution.scala
seems to use that annotation and that was added recently, so that's
probably the issue.
BTW, if anyone knows how Scala can find a reference to the original
On Mon, Nov 9, 2015 at 5:54 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> If there is no option to let shell skip processing @VisibleForTesting ,
> should the annotation be dropped ?
That's what we did last time this showed up.
> On Mon, Nov 9, 2015 at 5:50 PM, Marcelo Vanzin <v
On Thu, Nov 5, 2015 at 3:41 PM, Joey Paskhay wrote:
> We verified the Guava libraries are in the huge list of the included jars,
> but we saw that in the
> org.apache.spark.sql.hive.client.IsolatedClientLoader.isSharedClass method
> it seems to assume that *all*
Resources belong to the application, not each job, so the latter.
On Wed, Nov 4, 2015 at 9:24 AM, Nisrina Luthfiyati
wrote:
> Hi all,
>
> I'm running some spark jobs in java on top of YARN by submitting one
> application jar that starts multiple jobs.
> My question
Hi, your question is really CM-related and not Spark-related, so I'm
bcc'ing the list and will reply separately.
On Tue, Nov 3, 2015 at 11:08 AM, billou2k wrote:
> Hi,
> Sorry this is probably a silly question but
> I have a standard CDH 5.4.2 config with Spark 1.3 and
You can try the "--proxy-user" command line argument for spark-submit.
That requires that your RM configuration allows the user running your
AM to "proxy" other users. And I'm not completely sure it works
without Kerberos.
See:
On Tue, Oct 27, 2015 at 10:43 AM, Jerry Lam wrote:
> Anyone experiences issues in setting hadoop configurations after
> SparkContext is initialized? I'm using Spark 1.5.1.
>
> I'm trying to use s3a which requires access and secret key set into hadoop
> configuration. I tried
Best Regards,
>
> Jerry
>
>
> On Tue, Oct 27, 2015 at 2:05 PM, Marcelo Vanzin <van...@cloudera.com> wrote:
>>
>> On Tue, Oct 27, 2015 at 10:43 AM, Jerry Lam <chiling...@gmail.com> wrote:
>> > Anyone experiences issues in setting hadoop configurations
On Wed, Oct 14, 2015 at 10:01 AM, Florian Kaspar
wrote:
> we are working on a project running on Spark. Currently we connect to a
> remote Spark-Cluster in Standalone mode to obtain the SparkContext using
>
> new JavaSparkContext(new
>
On Wed, Oct 14, 2015 at 10:29 AM, Florian Kaspar wrote:
> so it is possible to simply copy the YARN configuration from the remote
> cluster to the local machine (assuming, the local machine can resolve the
> YARN host etc.) and just letting Spark do the rest?
>
Yes,
It would probably be more helpful if you looked for the executor error and
posted it. The screenshot you posted is the driver exception caused by the
task failure, which is not terribly useful.
On Tue, Oct 13, 2015 at 7:23 AM, wrote:
> Has anyone tried shuffle
tty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
>
> at
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>
> at
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>
>
arkSubmit.scala:193)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
>
> On 6 October 2015 at 16:20, Marcelo Vanzin <van...@cloudera.com> wrote:
>>
>> On Tue, Oct
On Tue, Oct 6, 2015 at 12:04 PM, Gary Ogden wrote:
> But we run unit tests differently in our build environment, which is
> throwing the error. It's setup like this:
>
> I suspect this is what you were referring to when you said I have a problem?
Yes, that is what I was
On Tue, Oct 6, 2015 at 5:57 AM, oggie wrote:
> We have a Java app written with spark 1.3.1. That app also uses Jersey 2.9
> client to make external calls. We see spark 1.4.1 uses Jersey 1.9.
How is this app deployed? If it's run via spark-submit, you could use
You're mixing app scheduling in the cluster manager (your [1] link)
with job scheduling within an app (your [2] link). They're independent
things.
On Fri, Oct 2, 2015 at 2:22 PM, Jacek Laskowski wrote:
> Hi,
>
> The docs in Resource Scheduling [1] says:
>
>> The standalone
On Fri, Oct 2, 2015 at 5:29 PM, Jacek Laskowski wrote:
>> The standalone cluster mode currently only supports a simple FIFO scheduler
>> across applications.
>
> is correct or not? :(
I think so. But, because they're different things, that does not mean
you cannot use a fair
How are you running the actual application?
I find it slightly odd that you're setting PYSPARK_SUBMIT_ARGS
directly; that's supposed to be an internal env variable used by
Spark. You'd normally pass those parameters in the spark-submit (or
pyspark) command line.
On Thu, Oct 1, 2015 at 8:56 AM,
If you want to process the data locally, why do you need to use sc.parallelize?
Store the data in regular Scala collections and use their methods to
process them (they have pretty much the same set of methods as Spark
RDDs). Then when you're happy, finally use Spark to process the
pre-processed
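A sketch of that idea in plain Java (names and data are made up):

```java
import java.util.List;
import java.util.stream.Collectors;

public class LocalPreprocess {
    public static void main(String[] args) {
        // Plain local collections support map/filter-style operations much
        // like RDDs do, and need no SparkContext for the pre-processing step.
        List<Integer> raw = List.of(1, 2, 3, 4, 5);
        List<Integer> cleaned = raw.stream()
                .filter(x -> x % 2 == 1)   // keep odd values
                .map(x -> x * 10)          // transform
                .collect(Collectors.toList());
        System.out.println(cleaned); // [10, 30, 50]
        // Only once the local work is done would you hand the result to
        // Spark, e.g. via sc.parallelize(cleaned).
    }
}
```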
(-dev@)
Try using the "yarn logs" command to read logs for finished
applications. You can also browse the RM UI to find more information
about the applications you ran.
On Mon, Sep 28, 2015 at 11:37 PM, Rachana Srivastava
wrote:
> Hello all,
>
>
>
> I am