Hi Sanket,
Driver and executor logs are written to stdout by default; this can be
configured using the SPARK_HOME/conf/log4j.properties file. That file,
along with the entire SPARK_HOME/conf directory, is automatically
propagated to all driver and executor containers and mounted as a volume.
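For reference, a minimal log4j.properties that keeps everything on stdout
at INFO (a sketch based on the log4j 1.x template Spark ships as
log4j.properties.template) could look like:

  log4j.rootCategory=INFO, console
  log4j.appender.console=org.apache.log4j.ConsoleAppender
  log4j.appender.console.target=System.out
  log4j.appender.console.layout=org.apache.log4j.PatternLayout
  log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n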
Thanks
On Mon, 9 Oct, 2023, 5:37
Hi Sanket, more details might help here.
What does your Spark configuration look like?
What exactly was done when this happened?
On Thu, 5 Oct, 2023, 2:29 pm Agrawal, Sanket,
wrote:
> Hello Everyone,
>
>
>
> We are trying to stream the changes in our Iceberg tables stored in AWS
> S3. We are
Hi Sachit,
The fix version on that JIRA says 3.0.2, so this fix is not yet released.
Soon there will be a 3.1.1 release; in the meantime you can try out the
3.1.1 RC, which also has the fix, and let us know your findings.
Thanks,
On Mon, Feb 1, 2021 at 10:24 AM Sachit Murarka
wrote:
>
A lot of developers may have already moved to 3.0.x. FYI, 3.1.0 is just
around the corner, hopefully in a few days, and has a lot of improvements
to Spark on K8s, including the transition from experimental to GA in this
release.
See: https://issues.apache.org/jira/browse/SPARK-33005
ate.driver.serviceAccountName=spark-sa --conf
> spark.kubernetes.container.image=sparkpy local:///opt/spark/da/main.py
>
> Kind Regards,
> Sachit Murarka
>
>
> On Mon, Jan 4, 2021 at 5:46 PM Prashant Sharma
> wrote:
>
>> Hi Sachit,
>>
>> Can you give more details on how did you
Hi Sachit,
Can you give more details on how you ran it, i.e. the spark-submit
command? My guess is that a service account with sufficient privileges was
not provided.
Please see:
http://spark.apache.org/docs/latest/running-on-kubernetes.html#rbac
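For example, the RBAC section of that page sets up a service account
roughly like this (namespace and account name are just illustrative):

  kubectl create serviceaccount spark
  kubectl create clusterrolebinding spark-role --clusterrole=edit \
    --serviceaccount=default:spark --namespace=default

and then passes --conf
spark.kubernetes.authenticate.driver.serviceAccountName=spark to
spark-submit.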
Thanks,
On Mon, Jan 4, 2021 at 5:27 PM Sachit Murarka
-dev
Hi,
I have used Spark with HDFS encrypted via Hadoop KMS, and it worked well.
Somehow, I cannot recall whether I had Kubernetes in the mix. Looking at
the error, it is not clear what caused the failure. Can I reproduce this
somehow?
Thanks,
On Sat, Aug 15, 2020 at 7:18 PM Michel
Hi Ashika,
Hadoop 2.6 is no longer supported, and since it has not been maintained
for the last 2 years, it may have unpatched security issues. From Spark 3.0
onwards we no longer support it; in other words, we have modified our
codebase in a way that Hadoop 2.6 won't work. However,
Hi Ankur,
Java 11 support was added in Spark 3.0.
https://issues.apache.org/jira/browse/SPARK-24417
Thanks,
On Tue, Jul 14, 2020 at 6:12 PM Ankur Mittal
wrote:
> Hi,
>
> I am using Spark 2.X and need to execute Java 11. It's not able to execute
> Java 11 using Spark 2.X.
>
> Is there any way
> scalable and dynamic-allocation-enabled for deploying Spark on K8s? Any
> suggested github repo or link?
>
>
>
> Thanks,
>
> Vaibhav V
>
>
>
>
>
> *From:* Prashant Sharma
> *Sent:* Friday, July 10, 2020 12:57 AM
> *To:* user@spark.apache.org
Hi,
Whether it is a blocker or not is up to you to decide. But the Spark K8s
cluster supports dynamic allocation through a different mechanism, that
is, without using an external shuffle service.
https://issues.apache.org/jira/browse/SPARK-27963. There are pros and cons
of both approaches. The only
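A minimal sketch of enabling it on K8s (config names as of Spark 3.x,
where shuffle tracking stands in for the external shuffle service):

  spark-submit \
    --conf spark.dynamicAllocation.enabled=true \
    --conf spark.dynamicAllocation.shuffleTracking.enabled=true \
    ...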
Hi,
My employer (IBM) is interested in hiring people in Hyderabad if they are
committers on any of the Apache projects and are interested in Spark and
its ecosystem.
Thanks,
Prashant.
I have a Spark Streaming job which takes too long to delete temp RDDs. I
collect about 4MM telemetry metrics per minute and do minor aggregations
in the streaming job. I am using Amazon R4 instances. The driver RPC call,
although async I believe, is slow getting the handle for the future object at
Hi Darshan,
Did you try passing the config directly as an option, like this:
.option("kafka.sasl.jaas.config", saslConfig)
Where saslConfig can look like:
com.sun.security.auth.module.Krb5LoginModule required \
useKeyTab=true \
storeKey=true \
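A fuller sketch in Scala for a structured streaming read (the broker,
topic, keytab path, and principal below are placeholders, not values from
the original thread):

  // JAAS config string passed straight through to the Kafka client
  val saslConfig =
    """com.sun.security.auth.module.Krb5LoginModule required
      |useKeyTab=true
      |storeKey=true
      |keyTab="/etc/security/keytabs/kafka_client.keytab"
      |principal="client@EXAMPLE.COM";""".stripMargin

  val df = spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9093")
    .option("kafka.security.protocol", "SASL_SSL")
    .option("kafka.sasl.jaas.config", saslConfig)
    .option("subscribe", "my-topic")
    .load()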
Hi,
The goal of my benchmark is to arrive at an end-to-end latency lower than
100ms and sustain it over time, by consuming from a Kafka topic and
writing back to another Kafka topic using Spark. Since the job does no
aggregation and does constant-time processing on each message, it
appeared to
+user -dev
Since the same hash-based partitioner is in action by default, in my
understanding the same partitioning will happen every time.
Thanks,
On Nov 10, 2016 7:13 PM, "WangJianfei"
wrote:
> Hi Devs:
> If I run sc.textFile(path, xxx) many times, will the
Hi Baahu,
That should not be a problem, given you allocate a sufficient buffer for
reading.
I was just working on implementing a patch[1] to support the feature of
reading wholetextfiles in SQL. This can actually be a slightly better
approach, because here we read into off-heap memory for holding
Since you are reading from a file stream, I would suggest saving to a file
instead of printing. There may be output the first time and then no data
in subsequent iterations.
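A sketch with the DStream API (the stream variable and output prefix are
illustrative):

  // persist each batch instead of stream.print()
  stream.saveAsTextFiles("hdfs:///tmp/filestream-out/batch")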
Prashant Sharma
On Tue, Apr 26, 2016 at 7:40 PM, Ashutosh Kumar <kmr.ashutos...@gmail.com>
wrote:
>
is one such formatter class.
thanks,
Prashant Sharma
On Wed, Apr 27, 2016 at 5:22 AM, Davies Liu <dav...@databricks.com> wrote:
> hdfs://192.168.10.130:9000/dev/output/test already exists, so you need
> to remove it first.
>
> On Tue, Apr 26, 2016 at 5:28 AM, Luke Adolph &l
As far as I can understand, your requirements are pretty straightforward
and doable with just simple SQL queries. Take a look at Spark SQL in the
Spark documentation.
Prashant Sharma
On Tue, Apr 12, 2016 at 8:13 PM, Joe San <codeintheo...@gmail.com> wrote:
This can happen if the system time is not in sync. By default, streaming
uses SystemClock (it also supports ManualClock), which relies on
System.currentTimeMillis() for determining the start time.
Prashant Sharma
On Sat, Apr 16, 2016 at 10:09 PM, Hemalatha A <
hemalatha.amru...@googlemail.com>
Maybe you can try creating it before running the app.
and
xml[1] messages.
Thanks,
Prashant Sharma
1. https://github.com/databricks/spark-xml
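A sketch of reading with that package (Spark 1.x API; the rowTag and path
are made up for illustration):

  val df = sqlContext.read
    .format("com.databricks.spark.xml")
    .option("rowTag", "message")
    .load("hdfs:///ingest/messages.xml")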
On Tue, Apr 19, 2016 at 10:31 AM, Deepak Sharma <deepakmc...@gmail.com>
wrote:
> Hi all,
> I am looking for an architecture to ingest 10 million messages in
> micro batches of seconds.
> If
This is a known issue.
https://issues.apache.org/jira/browse/SPARK-3200
Prashant Sharma
On Thu, Mar 3, 2016 at 9:01 AM, Rahul Palamuttam <rahulpala...@gmail.com>
wrote:
> Thank you Jeff.
>
> I have filed a JIRA under the following link :
>
> https://issues.apache.
) are
planning to work, I can help you?
Prashant Sharma
On Thu, Apr 9, 2015 at 3:08 PM, anakos ana...@gmail.com wrote:
Hi-
I am having difficulty getting the 1.3.0 Spark shell to find an external
jar. I have built Spark locally for Scala 2.11 and I am starting the REPL
as follows:
bin/spark
Hi Folks,
We are trying to run the following code from the spark shell in a CDH 5.3
cluster running on RHEL 5.8.
spark-shell --master yarn --deploy-mode client --num-executors 15
--executor-cores 6 --executor-memory 12G
import org.apache.spark.mllib.recommendation.ALS
import
That is just a warning. FYI, Spark ignores the BindException, probes for
the next available port, and continues. So your application is fine if
that particular error comes up.
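If you want to control how many ports get probed before giving up, there
is a setting for that (a sketch; 32 is arbitrary, the default is lower):

  spark-shell --conf spark.port.maxRetries=32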
Prashant Sharma
On Tue, Jan 20, 2015 at 10:30 AM, Deep Pradhan pradhandeep1...@gmail.com
wrote:
Yes, I have increased the driver
/patch-3/docs/building-spark.md
Prashant Sharma
On Tue, Nov 18, 2014 at 12:19 PM, Jianshi Huang jianshi.hu...@gmail.com
wrote:
Any notable issues for using Scala 2.11? Is it stable now?
Or can I use Scala 2.11 in my Spark application and use a Spark dist built
with 2.10?
I'm looking forward
Looks like sbt/sbt -Pscala-2.11 is broken by a recent patch for improving
the Maven build.
Prashant Sharma
On Tue, Nov 18, 2014 at 12:57 PM, Prashant Sharma scrapco...@gmail.com
wrote:
It is safe in the sense that we would help you with a fix if you run into
issues. I have used it, but since I
spray depends on and use the akka spark depends on.
Prashant Sharma
On Wed, Oct 29, 2014 at 9:27 AM, Jianshi Huang jianshi.hu...@gmail.com
wrote:
I'm using Spark built from HEAD; I think it uses a modified Akka 2.3.4, right?
right?
Jianshi
On Wed, Oct 29, 2014 at 5:53 AM, Mohammed Guller moham
What is the motivation behind this?
You can start with master as local[NO_OF_THREADS]. Reducing the threads in
all other places can have unexpected results. Take a look at this:
http://spark.apache.org/docs/latest/configuration.html.
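For example, a sketch (the thread count is arbitrary):

  import org.apache.spark.{SparkConf, SparkContext}

  // run everything in a single JVM with 4 worker threads
  val conf = new SparkConf().setMaster("local[4]").setAppName("app")
  val sc = new SparkContext(conf)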
Prashant Sharma
On Tue, Oct 28, 2014 at 2:08 PM, Wanda
Are you doing this in the REPL? Then there is a bug filed for this; I just
can't recall the bug ID at the moment.
Prashant Sharma
On Fri, Oct 24, 2014 at 4:07 AM, Niklas Wilcke
1wil...@informatik.uni-hamburg.de wrote:
Hi Jao,
I don't really know why this doesn't work, but I have two hints
[Removing dev lists]
You are absolutely correct about that.
Prashant Sharma
On Tue, Oct 14, 2014 at 5:03 PM, Priya Ch learnings.chitt...@gmail.com
wrote:
Hi Spark users/experts,
In the Spark source code (Master.scala, Worker.scala), when registering
the worker with the master, I see the usage
So if you need those features, you can go ahead and set up either the
filesystem or the ZooKeeper option. Please take a look at:
http://spark.apache.org/docs/latest/spark-standalone.html.
Prashant Sharma
On Wed, Oct 15, 2014 at 3:25 PM, Chitturi Padma
learnings.chitt...@gmail.com wrote:
which means
What is your Spark version? This was fixed, I suppose. Can you try it with
the latest release?
Prashant Sharma
On Fri, Sep 12, 2014 at 9:47 PM, Ramaraju Indukuri iramar...@gmail.com
wrote:
This is only a problem in the shell; it works fine in batch mode. I am
also interested in how others
Hey,
You can use spark-shell -i sparkrc to do this.
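For example, a sparkrc file could contain something like this (just a
sketch; spark-shell already defines sc before running it):

  import org.apache.spark.sql.SQLContext
  implicit val sqlContext = new SQLContext(sc)
  import sqlContext._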
Prashant Sharma
On Wed, Sep 3, 2014 at 2:17 PM, Jianshi Huang jianshi.hu...@gmail.com
wrote:
To make my shell experience merrier, I need to import several packages,
and define implicit sparkContext and sqlContext.
Is there a startup
-framework/chill-akka) might help. I am not well versed in how Kryo works
internally; maybe someone else can throw some light on this.
Prashant Sharma
On Sat, Jul 26, 2014 at 6:26 AM, Alan Ngai a...@opsclarity.com wrote:
The stack trace was from running the Actor count sample directly
it is kinda fast to do either tag prediction at a point, which is not
accurate, etc., but it's useful.
In case you are working on building this (inferior mode for the Spark
REPL) for us, I can come up with a wishlist.
Prashant Sharma
On Sat, Jul 26, 2014 at 3:07 AM, Andrei faithlessfri...@gmail.com wrote:
I
Hi,
What is your ZeroMQ version? It is known to work well with 2.2.
An output of `sudo ldconfig -v | grep zmq` would be helpful in this regard.
Thanks
Prashant Sharma
On Wed, Jun 4, 2014 at 11:40 AM, Tobias Pfeiffer t...@preferred.jp wrote:
Hi,
I am trying to use Spark Streaming (1.0.0
%3DGJh1g2zxOJd02Wt7L06mCLjo-vwwG9Q%40mail.gmail.com%3E
Prashant Sharma
On Fri, May 2, 2014 at 3:56 PM, N.Venkata Naga Ravi nvn_r...@hotmail.com wrote:
Hi,
I am trying to build Apache Spark with Java 8 on my Mac system (OS X
10.8.5), but am getting the following exception.
Please help with resolving
I have pasted the link in my previous post.
Prashant Sharma
On Fri, May 2, 2014 at 4:15 PM, N.Venkata Naga Ravi nvn_r...@hotmail.com wrote:
Thanks for your quick reply.
I tried with a fresh installation; it downloads sbt 0.12.4 only (please
check the logs below). So it is not working. Can you
I'd like to be corrected on this, but I am just trying to say small
enough, on the order of a few hundred MBs. Since it gets shipped to all
nodes, it can be a GB but not GBs, and then it depends on the network too.
Prashant Sharma
On Fri, May 2, 2014 at 6:42 PM, Diana Carroll dcarr
with ZeroMQ 2.2.0, and if you
have the jzmq libraries installed, performance is much better.
On Tue, Apr 29, 2014 at 12:29 PM, Francis.Hu
francis...@reachjunction.com wrote:
Hi, all
I installed spark-0.9.1 and zeromq 4.0.1, and then ran the example below:
./bin/run-example
Well, that is not going to be easy, simply because we depend on
akka-zeromq for ZeroMQ support. And since Akka does not support the latest
ZeroMQ library yet, I doubt there is something simple that can be done to
support it.
Prashant Sharma
On Tue, Apr 29, 2014 at 2:44 PM, Francis.Hu francis
Prashant Sharma
On Thu, Apr 24, 2014 at 12:15 PM, Carter gyz...@hotmail.com wrote:
Thanks Mayur.
So without Hadoop and any other distributed file systems, by running:
val doc = sc.textFile("/home/scalatest.txt", 5)
doc.count
we can only get parallelization within the computer where
It is the same file, and the Hadoop library we use for splitting takes
care of assigning the right split to each node.
Prashant Sharma
On Thu, Apr 24, 2014 at 1:36 PM, Carter gyz...@hotmail.com wrote:
Thank you very much for your help Prashant.
Sorry I still have another question about your
I think Mahout uses FuzzyKMeans, which is a different algorithm, and it is
not iterative.
Prashant Sharma
On Tue, Mar 25, 2014 at 6:50 PM, Egor Pahomov pahomov.e...@gmail.com wrote:
Hi, I'm running a benchmark which compares Mahout and Spark ML. For now I
have the following results for k-means:
Number