Regards,
Raghava.
On Wed, Jul 20, 2016 at 2:08 AM, Saurav Sinha
wrote:
> Hi,
>
> I have set the driver memory to 10 GB and the job ran with intermediate failures,
> which Spark recovered from.
>
> But I still want to know: if the number of partitions increases, does the driver
> RAM need to be increased as well?
Thank you. Sure, if I find something I will post it.
Regards,
Raghava.
On Wed, Jun 22, 2016 at 7:43 PM, Nirav Patel wrote:
> I believe it would be task, partition, and task status information. I do not
> know the exact size of those things, but I had an OOM on the driver with 512 MB and
> increasing
available limit. So the other options are:
1) Separate the driver from the master, i.e., run them on two separate nodes.
2) Increase the RAM capacity on the driver/master node.
Regards,
Raghava.
On Wed, Jun 22, 2016 at 7:05 PM, Nirav Patel wrote:
> Yes, the driver keeps a fair amount of metadata to manage
them to T, i.e., T = T + deltaT.
3) Stop when the current size (count) of T is the same as the previous size of T, i.e.,
deltaT is 0.
Do you think something happens on the driver, due to the application logic, when the
number of partitions is increased?
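For concreteness, a minimal sketch of that fixpoint loop; initialRdd and computeDelta are
placeholders for the starting RDD and the application logic that derives the new tuples:

// Iterate until no new tuples are produced: T grows by deltaT each round
// and the loop stops when the count stops changing.
var t = initialRdd.cache()           // placeholder for the starting RDD
var prevCount = -1L
var currCount = t.count()
while (currCount != prevCount) {
  val delta = computeDelta(t)        // placeholder for the application logic
  t = t.union(delta).distinct().cache()
  prevCount = currCount
  currCount = t.count()              // action that runs a job each iteration
}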
Regards,
Raghava.
On Wed, Jun 22, 2016 at 12:33 PM, Sonal Goyal
What could be the possible reasons behind the driver-side OOM when the
number of partitions is increased?
Regards,
Raghava.
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
On Fri, May 13, 2016 at 6:33 AM, Raghava Mutharaju <
m.vijayaragh...@gmail.com> wrote:
> Thank you for the response.
>
> I use
= "org.apache.spark" % "spark-sql_2.11" % "2.0.0-SNAPSHOT"
lazy val root = (project in file(".")).
settings(
name := "sparkel",
version := "0.1.0",
scalaVersion := "2.11.8",
libraryDependencies += spark,
library
tting of spark version gives sbt error
unresolved dependency: org.apache.spark#spark-core_2.11;2.0.0-SNAPSHOT
I guess this is because the repository doesn't contain 2.0.0-SNAPSHOT. Does this
mean the only option is to put all the required jars in the lib folder (unmanaged
dependencies)?
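An alternative to unmanaged jars might be adding a snapshot resolver. This is only a
sketch: it assumes the SNAPSHOT is actually published to the Apache snapshots
repository, which may not be the case.

// build.sbt (sketch): let sbt look for -SNAPSHOT artifacts in the Apache
// snapshots repository, in addition to the default resolvers.
resolvers += "Apache Snapshots" at "https://repository.apache.org/content/repositories/snapshots/"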
Regards,
Raghava.
into account that both RDDs are already hash
partitioned.
Regards,
Raghava.
On Tue, May 10, 2016 at 11:44 AM, Rishi Mishra
wrote:
> As you have the same partitioner and number of partitions, you can probably use
> zipPartitions and provide a user-defined function to subtract.
>
> A very primitive
me(16)))
(3,(16,Some(30)))
(3,(16,Some(16)))
case (x, (y, z)) => Apart from allowing z == None and filtering on y == z,
we should also filter out (3, (16, Some(30))). How can we do that
efficiently without broadcasting any elements of rdd2?
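A minimal sketch of the zipPartitions idea for this case, assuming both RDDs are
RDD[(Int, Int)] and were partitioned with the same HashPartitioner into the same
number of partitions; rdd1 and rdd2 are placeholders:

import org.apache.spark.rdd.RDD

// Because equal pairs land in the same partition index of both RDDs,
// the subtraction can be done partition-by-partition with no shuffle
// and no broadcast.
def subtract(rdd1: RDD[(Int, Int)], rdd2: RDD[(Int, Int)]): RDD[(Int, Int)] =
  rdd1.zipPartitions(rdd2, preservesPartitioning = true) { (it1, it2) =>
    val toRemove = it2.toSet          // only the co-located slice of rdd2
    it1.filterNot(toRemove.contains)
  }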
Regards,
Raghava.
On Mon, May 9, 2016
.
Regards,
Raghava.
We use Spark 1.6.0.
We noticed the following:
1) Persisting an RDD seems to lead to an unbalanced distribution of partitions
across the executors.
2) If one RDD has an all-or-nothing skew, then the rest of the RDDs that depend on
it also get the all-or-nothing skew.
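To make observation 1 concrete, a small sketch for inspecting the per-partition record
counts of a persisted RDD; rdd is a placeholder, and which executor hosts each cached
partition can be checked in the web UI's Storage/Executors pages:

// Count records in each partition; a heavy imbalance here matches the
// "all data on one executor, nothing on the other" symptom.
val perPartition = rdd.mapPartitionsWithIndex { (idx, it) =>
  Iterator((idx, it.size))
}.collect().sortBy(_._1)

perPartition.foreach { case (idx, n) => println(s"partition $idx: $n records") }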
Regards,
Raghava.
On Wed, Apr 27, 2016 at 10:20 AM
happens when count is moved).
Any pointers for figuring out this issue are much appreciated.
Regards,
Raghava.
On Fri, Apr 22, 2016 at 7:40 PM, Mike Hynes <91m...@gmail.com> wrote:
> Glad to hear that the problem was solvable! I have not seen delays of this
> type for later stages in j
Thank you. For now we plan to use spark-shell to submit jobs.
Regards,
Raghava.
On Fri, Apr 22, 2016 at 7:40 PM, Mike Hynes <91m...@gmail.com> wrote:
> Glad to hear that the problem was solvable! I have not seen delays of this
> type for later stages in jobs run by spark-submit,
stage also.
Apart from introducing a dummy stage or running it from spark-shell, is
there any other option to fix this?
Regards,
Raghava.
On Mon, Apr 18, 2016 at 12:17 AM, Mike Hynes <91m...@gmail.com> wrote:
> When submitting a job with spark-submit, I've observed delays (up to
No. We specify it as a configuration option to spark-submit. Does that
make a difference?
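For reference, a sketch of the in-program alternative; the master URL and app name
are placeholders. A master set on SparkConf in code takes precedence over the value
passed to spark-submit.

import org.apache.spark.{SparkConf, SparkContext}

// Hard-coding the master in the program; when it is omitted here, the value
// from spark-submit (--master or --conf spark.master=...) is used instead.
val conf = new SparkConf()
  .setAppName("sparkel")
  .setMaster("spark://master-host:7077")   // placeholder URL
val sc = new SparkContext(conf)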
Regards,
Raghava.
On Mon, Apr 18, 2016 at 9:56 AM, Sonal Goyal wrote:
> Are you specifying your spark master in the scala program?
>
> Best Regards,
> Sonal
> Founder, Nube Tec
all the data is on one node and nothing on the other, and no, the keys are not the
same; they vary from 1 to around 55000 (integers). What makes this strange is that
it seems to work fine in the Spark shell (REPL).
Regards,
Raghava.
On Mon, Apr 18, 2016 at 1:14 AM, Mike Hynes <91m...@gmail.
size (which is more than adequate now). This behavior differs between the
spark-shell and the Spark Scala program.
We are not using YARN; it's the standalone version of Spark.
Regards,
Raghava.
On Mon, Apr 18, 2016 at 12:09 AM, Anuj Kumar wrote:
> Few params like- spark.task.cpus, spark.cores.
retainedJobs and retainedStages have been increased to check them in the UI.
What information regarding the Spark context would be of interest here?
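For reference, the full property names are spark.ui.retainedJobs and
spark.ui.retainedStages; a sketch with arbitrary values:

// Keep more completed jobs/stages around so they stay visible in the web UI.
val conf = new org.apache.spark.SparkConf()
  .set("spark.ui.retainedJobs", "5000")
  .set("spark.ui.retainedStages", "5000")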
Regards,
Raghava.
On Sun, Apr 17, 2016 at 10:54 PM, Anuj Kumar wrote:
> If the data file is the same, then it should have a similar distribution of keys.
this
behavior does not change. This seems strange.
Is there some problem with the way we use HashPartitioner?
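For reference, a minimal sketch of the usage pattern being asked about; the pair-RDD
name and the partition count are placeholders:

import org.apache.spark.HashPartitioner

// HashPartitioner places a key in partition key.hashCode modulo the number of
// partitions (made non-negative), so distinct integer keys in roughly 1..55000
// should spread across all partitions rather than land on a single node.
val partitioned = pairs.partitionBy(new HashPartitioner(8)).persist()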
Thanks in advance.
Regards,
Raghava.
][] in Scala?
Does this point to some other issue?
In some other posts, I noticed use of kryo.register(). In this case, how do
we pass the kryo object to SparkContext?
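On the kryo.register() question: the Kryo object is never passed to SparkContext
directly; Spark hands it to a custom KryoRegistrator named in spark.kryo.registrator.
A sketch, with the class and package names as placeholders:

import com.esotericsoftware.kryo.Kryo
import org.apache.spark.serializer.KryoRegistrator
import org.apache.spark.sql.types.StructField

// Spark instantiates this class and calls registerClasses with its own Kryo
// instance, so we only need to point spark.kryo.registrator at it.
class MyRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    kryo.register(classOf[StructField])
    kryo.register(classOf[Array[StructField]])
  }
}

// conf.set("spark.kryo.registrator", "mypackage.MyRegistrator")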
Thanks in advance.
Regards,
Raghava.
(org.apache.spark.sql.types.StructField[].class);
I tried registering
using conf.registerKryoClasses(Array(classOf[StructField[]]))
But StructField[] does not exist. Is there any other way to register it? I
already registered StructField.
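A sketch of what may be the missing piece: in Scala the Java array type StructField[]
is written Array[StructField], and classOf[Array[StructField]] gives that runtime class.

import org.apache.spark.SparkConf
import org.apache.spark.sql.types.StructField

val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  // classOf[Array[StructField]] is the same runtime class as StructField[].class in Java.
  .registerKryoClasses(Array(classOf[StructField], classOf[Array[StructField]]))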
Regards,
Raghava.
Thanks a lot Ted.
If the two columns are of different types, say Int and Long, then will it be
ds.select(expr("_2 / _1").as[(Int, Long)])?
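If the goal is just the single result column of the division, a variant to consider
is sketched below; it assumes the implicits are imported (for the Double encoder) and
relies on Spark SQL's / returning a double for integral inputs.

import org.apache.spark.sql.functions.expr

// expr("_2 / _1") is one derived column, so the typed result is that single
// value rather than a tuple of the two input columns.
val ratios = ds.select(expr("_2 / _1").as[Double])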
Regards,
Raghava.
On Wed, Feb 10, 2016 at 5:19 PM, Ted Yu wrote:
> bq. I followed something similar $"a.x"
>
> Please use expr(
Ted,
Thank you for the pointer. That works, but what does a string prepended with a $ sign
mean? Is it an expression?
Could you also help me with the select() parameter syntax? I followed something
similar, $"a.x", and it gives an error message that a TypedColumn
is expected.
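For reference, a sketch of what the $ prefix is and how to satisfy the TypedColumn
requirement; the column name and type are illustrative, and the implicits import is
sqlContext.implicits._ on the 1.6 API (spark.implicits._ on 2.x).

import sqlContext.implicits._   // provides the StringToColumn implicit behind $"..."

// $"x" is shorthand for a Column named "x" ...
val c: org.apache.spark.sql.Column = $"x"

// ... and a typed Dataset's select() wants a TypedColumn, which .as[T] supplies:
val typed = ds.select($"x".as[Int])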
Regards,
Raghava.
uot;x") == B.toDF().col("y"))
Is there a way to avoid using toDF()?
I am having similar issues with the usage of filter(A.x == B.y)
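A sketch of one way to avoid toDF(), assuming a Spark 2.0-style Dataset where columns
can be referenced directly on the Dataset, and assuming fields x on A and y on B; note
that column comparisons use ===, not ==.

// joinWith keeps both sides typed and yields Dataset[(A, B)].
val joined = A.joinWith(B, A("x") === B("y"))

// The same === operator builds a Column predicate for filter:
val filtered = A.filter(A("x") === 55000)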
--
Regards,
Raghava
/stages?
Thanks in advance.
Raghava.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/DAG-visualization-no-visualization-information-available-with-history-server-tp26117.html
Hello All,
I am new to Spark and I am trying to understand how the iterative application of
operations is handled in Spark. Consider the following program in Scala.
var u = sc.textFile(args(0) + "s1.txt").map(line => {
  line.split("\\|") match {
    case Array(x, y) => (y.toInt, x.toInt)
  }
})