1.5TB is incredibly high. It doesn't seem to be a configuration problem. Could
you paste the code snippet doing the loop and join task on the dataset?
Best regards,
From: rachmaninovquartet
Sent: Thursday, April 13, 2017 10:08:40
Hi,
I have a Spark 1.6.2 app (tested previously on 2.0.0 as well). It is
requiring a ton of memory (1.5TB) for a small dataset (~500MB). The memory
usage seems to jump when I loop through and inner join to make the dataset
12 times as wide. The app goes down during or after this loop, when I try
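
(For illustration only, a minimal sketch of the kind of widening join loop described above, with hypothetical DataFrame and column names; the real app code will differ:)

from pyspark.sql import functions as F

# 'base' is the small (~500MB) input DataFrame; 'id' is a hypothetical join key.
wide = base
for i in range(1, 12):
    renamed = base.select([F.col(c).alias("%s_%d" % (c, i)) if c != "id" else F.col(c)
                           for c in base.columns])
    wide = wide.join(renamed, "id", "inner")
# Each pass adds another join to the logical plan, so planning and shuffle
# overhead can grow with every iteration even though the input stays small.
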
Hey all,
I was wondering if anyone could point me where to start debugging the
following error:
ERROR Dropping SparkListenerEvent because no remaining room in event
queue. This likely means one of the SparkListeners is too slow and
cannot keep up with the rate at which tasks are being started by
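
If the jobs themselves still succeed and it is only listener/UI data being dropped, one knob sometimes suggested is the listener bus queue size. A hedged sketch; the property name below is the Spark 2.x one and is an assumption for your version:

from pyspark import SparkConf, SparkContext

# Assumption: spark.scheduler.listenerbus.eventqueue.size is the relevant
# property (Spark 2.x name; default is 10000 events).
conf = (SparkConf()
        .setAppName("bigger-listener-queue")
        .set("spark.scheduler.listenerbus.eventqueue.size", "100000"))
sc = SparkContext(conf=conf)
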
Hi, can I ask you for a complete example where you apply a UDF multiple
times, one after another, and then cache your data frame, or checkpoint the
data frame at the appropriate steps (cache or checkpoint)?
Thanks
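
Not a complete application, but a minimal sketch of that pattern, assuming Spark 2.1+ where DataFrame.checkpoint() exists; all paths, column names and UDF logic are made up for illustration (on 1.6 you would have to break the lineage another way, e.g. by writing out and reading back):

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StringType, DoubleType

spark = SparkSession.builder.appName("udf-chain-sketch").getOrCreate()
spark.sparkContext.setCheckpointDir("/tmp/spark-checkpoints")  # required before checkpoint()

# two hypothetical UDFs applied one after the other
clean_str = F.udf(lambda s: s.strip().lower() if s is not None else None, StringType())
to_ratio = F.udf(lambda x: float(x) / 100.0 if x is not None else None, DoubleType())

df = spark.read.parquet("/path/to/input")            # hypothetical input

df = df.withColumn("name", clean_str(F.col("name"))).cache()   # cache after a heavy stage
df = df.withColumn("rate", to_ratio(F.col("rate"))).cache()

df = df.checkpoint(eager=True)   # Spark 2.1+: truncates the lineage to this point

df.write.mode("overwrite").parquet("/path/to/output")
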
Hi,
what is the number of columns that Spark can handle without fuss?
Regards
Hi,
I wonder if there is a way to fix the code after getting a StackOverflowError.
I mean you have:
df <- transformation 1
df <- transformation 2
df <- transformation 3
df <- transformation 4
.
.
.
df <- transformation n
and then:
df <- transformation n+1 raises a java.lang.StackOverflowError. How can I fix this?
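
I am not sure there is an official fix, but one workaround often mentioned for Spark 1.6 (which has no DataFrame.checkpoint()) is to periodically rebuild the DataFrame from its RDD so the logical plan stays shallow. A hedged sketch, with a hypothetical list of transformation functions:

# 'transformations' is a hypothetical list of functions mapping DataFrame -> DataFrame.
for i, t in enumerate(transformations):
    df = t(df)
    if i % 20 == 19:
        df = df.persist()
        # rebuilding from the RDD gives the new DataFrame a shallow plan
        df = sqlContext.createDataFrame(df.rdd, df.schema)
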
Hi users,
I got this error "java.io.InvalidClassException:
org.apache.commons.lang3.time.FastDateParser; local class incompatible: stream
classdesc serialVersionUID = 3, local class serialVersionUID = 2" when running a
Spark application to read from and write to a CSV file.
my spark
Hi,
what Spark version are you using?
Did you register the UDF?
How are you using the UDF?
Does the UDF support that data type as parameter?
What I do with Spark 2.0 is
- Create the UDF for each data type I need
- Register the UDF with the sqlContext / SparkSession
- Use the UDF on a DataFrame, not an RDD (you can convert it)
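
For instance, a rough PySpark sketch of those steps (all names and the UDF logic are placeholders):

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import IntegerType, StringType

spark = SparkSession.builder.appName("udf-registration-sketch").getOrCreate()

# one UDF per data type (hypothetical logic)
str_len = F.udf(lambda s: len(s) if s is not None else 0, IntegerType())
to_upper = F.udf(lambda s: s.upper() if s is not None else None, StringType())

# register for use in SQL queries as well
spark.udf.register("str_len", lambda s: len(s) if s is not None else 0, IntegerType())

# use the UDFs on a DataFrame (convert an RDD with createDataFrame/toDF first if needed)
df = spark.createDataFrame([("alice",), ("bob",)], ["name"])
df.select(to_upper(F.col("name")).alias("name_up"),
          str_len(F.col("name")).alias("name_len")).show()
spark.sql("SELECT str_len('hello') AS n").show()
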
Hi,
can someone explain to me how I can use checkpoint in PySpark (not in Scala)?
Because I have a lot of UDFs to apply on a large data frame and I don't
understand how I can use checkpoint to break the lineage and prevent
java.lang.StackOverflowError.
Regards
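
In PySpark 1.6 there is no DataFrame.checkpoint() (it arrived in 2.1), so the usual ways to break the lineage are either a round trip through storage or checkpointing the underlying RDD. A hedged sketch, with placeholder paths:

# Option 1: write the intermediate result out and read it back --
# the reread DataFrame starts with a fresh, short lineage.
df.write.mode("overwrite").parquet("/tmp/stage1")
df = sqlContext.read.parquet("/tmp/stage1")

# Option 2: checkpoint the underlying RDD and rebuild the DataFrame from it.
sc.setCheckpointDir("/tmp/spark-checkpoints")
rdd = df.rdd
rdd.checkpoint()
rdd.count()                 # action to force the checkpoint to be written
df = sqlContext.createDataFrame(rdd, df.schema)
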
Looks like your UDF expects numeric data but you are sending it a string type.
I suggest casting to numeric.
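
Something like the following, with made-up column and UDF names, casting before the UDF sees the value:

from pyspark.sql import functions as F

# cast the string column to a numeric type before the UDF sees it
# ('amount' and my_numeric_udf are placeholder names)
df = df.withColumn("amount", F.col("amount").cast("double"))
df = df.withColumn("amount_scaled", my_numeric_udf(F.col("amount")))
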
On Thu, 13 Apr 2017 at 7:03 pm, issues solution
wrote:
> Hi
> I am new to Spark and I want to ask you what is wrong with checkpoint on
> PySpark 1.6.0
>
> I don't
Hi,
what is the origin of this error?
java.lang.UnsupportedOperationException: Cannot evaluate expression:
PythonUDF#Grappra(input[410, StringType])
Regards
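
Hard to say without the full query, but that exception typically shows up when a Python UDF ends up somewhere Catalyst has to evaluate it directly (for example inside a join condition). One common workaround is to materialize the UDF result as a column first and then use that column; a rough sketch with placeholder names:

from pyspark.sql import functions as F

# materialize the Python UDF output as an ordinary column first
# (left, right, some_col and key are placeholder names; udf_Grappra is yours)
left2 = left.withColumn("grappra_key", udf_Grappra(F.col("some_col")))

# then join on the materialized column instead of calling the UDF
# inside the join condition itself
joined = left2.join(right, left2["grappra_key"] == right["key"])
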
Hi All,
I have registered temp tables using both the Hive context and the SQL context.
Now when I try to join these two temp tables, one of them complains about
not being found.
Is there any setting or option so the tables in these 2 different contexts
are visible to each other?
--
Thanks
Deepak
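
As far as I know, temp tables are scoped to the context that registered them, so the simplest fix is usually to go through a single context for both; HiveContext is a superset of SQLContext. A rough sketch with placeholder table names and paths:

from pyspark.sql import HiveContext

# read and register both DataFrames through one context so the temp tables
# live in the same catalog
hc = HiveContext(sc)

df1 = hc.read.parquet("/path/one")       # hypothetical sources
df2 = hc.table("some_hive_table")

df1.registerTempTable("t1")
df2.registerTempTable("t2")

joined = hc.sql("SELECT * FROM t1 JOIN t2 ON t1.id = t2.id")
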
Hi
I am new to Spark and I want to ask you what is wrong with checkpoint on
PySpark 1.6.0.
I don't understand what happens after I try to use it on a DataFrame:
dfTotaleNormalize24 = dfTotaleNormalize23.select([i if i not in
listrapcot else udf_Grappra(F.col(i)).alias(i) for i in
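
For what it's worth, that select looks like it applies udf_Grappra only to the columns in listrapcot; a guess at the completed statement (the truncated tail filled in as "for i in dfTotaleNormalize23.columns") would be roughly:

from pyspark.sql import functions as F

# apply udf_Grappra to the columns in listrapcot, keep every other column unchanged
dfTotaleNormalize24 = dfTotaleNormalize23.select(
    [i if i not in listrapcot else udf_Grappra(F.col(i)).alias(i)
     for i in dfTotaleNormalize23.columns])

Note that DataFrame.checkpoint() does not exist in PySpark 1.6 (it was added in 2.1), which may be why the checkpoint attempt fails; the write-out/read-back workaround sketched earlier in the thread is the usual substitute.
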
There is a JIRA and a prototype which analyzes the JVM bytecode in the black
box and converts the closures into Catalyst expressions.
https://issues.apache.org/jira/browse/SPARK-14083
This could potentially address the issue discussed here.
Sincerely,
DB Tsai