Hi Raj,
What you need seems to be an event-based initiation of a DStream. I have not seen
one yet. There are many types of DStreams that Spark implements, and you can also
implement your own. InputDStream is a close match for your requirement.
See this for the available options with InputDStream:
Using sc.textFile will also read the file from HDFS line by line through an
iterator, so the whole file does not need to fit into memory; even with a small
amount of memory it will still work.
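To illustrate the point, a minimal sketch (the path and filter are made-up examples, not from this thread): transformations on the RDD stay lazy, and lines are streamed through partition by partition only when an action runs.

    // hypothetical input path
    val lines = sc.textFile("hdfs:///data/input.txt")
    // transformations are lazy; nothing is loaded yet
    val errors = lines.filter(_.contains("ERROR"))
    // only the action pulls lines through, one partition at a time
    println(errors.count())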
2015-06-12 13:19 GMT+08:00 SLiZn Liu sliznmail...@gmail.com:
Hmm, you have a good point. So should I load the file
Thanks for the extra details and explanations Chaocai, will try to
reproduce this when I get a chance.
Cheng
On 6/12/15 3:44 PM, 姜超才 wrote:
I said the OOM occurred on the slave node because I monitored memory
utilization during the query task; on the driver, very little memory was
occupied. And I remember I
Not sure if Spark RDD will provide an API to fetch records one by one from the
final result set, instead of pulling them all (or whole partition data) into
driver memory.
Seems like a big change.
From: Cheng Lian [mailto:l...@databricks.com]
Sent: Friday, June 12, 2015 3:51 PM
To:
My guess is that the reason local mode is OK while the standalone cluster
doesn't work is that in cluster mode, task results are serialized and
sent to the driver side. The driver needs to deserialize the results, and thus
occupies much more memory than in local mode (where task result
de/serialization is not
Is it a lot of data that is expected to come through stdin? I mean is it
even worth parallelizing the computation using something like Spark
Streaming?
On Thu, Jun 11, 2015 at 9:56 PM, Heath Guo heath...@fb.com wrote:
Thanks for your reply! In my use case, it would be stream from only one
Hi Chaocai,
Glad that 1.4 fixes your case. However, I'm a bit confused by your last
comment saying "The OOM or lose heartbeat was occurred on slave node",
because from the log files you attached at first, those OOMs actually
happen on the driver side (the Thrift server log only contains log lines from
Not sure if Spark Core will provide an API to fetch records one by one from the
block manager, instead of pulling them all into driver memory.
From: Cheng Lian [mailto:l...@databricks.com]
Sent: Friday, June 12, 2015 3:51 PM
To: 姜超才; Hester wang; user@spark.apache.org
Subject: Re: 回复:
It was released yesterday.
On Friday, June 12, 2015, ayan guha guha.a...@gmail.com wrote:
Hi
When is official spark 1.4 release date?
Best
Ayan
The Scala KafkaRDD uses a trait to handle this problem, but it is not so easy
and straightforward in Python, where we would need a specific API to handle it.
I'm not sure whether there is any simple workaround to fix this; maybe we
should think carefully about it.
2015-06-12 13:59 GMT+08:00 Amit Ramesh
This is a quick question about checkpoints. The question is: if the
StreamingContext is not stopped gracefully, will the checkpoint be
consistent?
Or should I always shut down the application gracefully in order to
use the checkpoint?
Thank you very much!
Hi
When is official spark 1.4 release date?
Best
Ayan
Yes, it is lots of data, and the utility I'm working with prints out an infinite
real-time data stream. Thanks.
From: Tathagata Das t...@databricks.com
Date: Thursday, June 11, 2015 at 11:43 PM
To: Heath Guo heath...@fb.com
Cc: user
Would using socketTextStream and `yourApp | nc -lk port` work? Not
sure how resilient the socket receiver is, though. I've been playing with it
for a little demo and I don't yet understand its reconnection behavior.
Although I would think that putting some elastic buffer in between would be
a
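For what it's worth, a minimal sketch of the piped-socket idea (the port and batch interval are assumptions): pipe the utility into netcat, then read the socket as a text stream.

    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // on the host: yourApp | nc -lk 9999
    val ssc = new StreamingContext(sc, Seconds(5))
    val lines = ssc.socketTextStream("localhost", 9999)
    lines.count().print()   // e.g. just print the record count per batch
    ssc.start()
    ssc.awaitTermination()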
On 6/10/15 8:53 PM, Bipin Nag wrote:
Hi Cheng,
I am using the Spark 1.3.1 binary available for Hadoop 2.6. I am loading
an existing parquet file, then repartitioning and saving it. Doing
this gives the error. The code for this doesn't look like it's causing the
problem. I have a feeling the source -
Hi,
I have created a custom receiver in Java which receives data from WebSphere MQ,
and I am only writing the received records to HDFS.
I have consulted many forums about optimizing the speed of a Spark Streaming
application. Here I am listing a few:
* Spark
Hi, I'm using Spark Streaming (1.3.1).
I want to get exactly-once messaging from Kafka and use window operations of
DStream.
When I use window operations (e.g. DStream#reduceByKeyAndWindow) with the Kafka
direct API,
a java.lang.ClassCastException occurs as follows.
--- stacktrace ---
At the time 1.3.x was released, Parquet 1.6.0 hadn't been released yet, and we
didn't have enough time to upgrade to and test Parquet 1.6.0 for Spark
1.4.0. But we've already upgraded Parquet to 1.7.0 (which is exactly the
same as 1.6.0 with the package name renamed from com.twitter to
org.apache.parquet) on
Hi Cheng,
Yes, some rows contain unit instead of decimal values. I believe some rows
from the original source I had don't have any value, i.e. it is null, and that
shows up as unit. How do Spark SQL or Parquet handle null in place of
decimal values, assuming that field is nullable? I will have
Thanks a lot.
On 12 Jun 2015 19:46, Todd Nist tsind...@gmail.com wrote:
It was released yesterday.
On Friday, June 12, 2015, ayan guha guha.a...@gmail.com wrote:
Hi
When is official spark 1.4 release date?
Best
Ayan
Here is a Spark 1.4 release blog by Databricks:
https://databricks.com/blog/2015/06/11/announcing-apache-spark-1-4.html
Guru Medasani
gdm...@gmail.com
On Jun 12, 2015, at 7:08 AM, ayan guha guha.a...@gmail.com wrote:
Hi
What is the reason that Spark still comes with Parquet 1.6.0rc3? It seems like
newer Parquet versions are available (e.g. 1.6.0). This would fix problems with
‘spark.sql.parquet.filterPushdown’, which is currently disabled by default
because of a bug in Parquet 1.6.0rc3.
Thanks!
Eric
Great, thanks for the extra info!
On 12 Jun 2015, at 12:41, Cheng Lian lian.cs@gmail.com wrote:
At the time 1.3.x was released, Parquet 1.6.0 hadn't been released yet, and we didn't
have enough time to upgrade to and test Parquet 1.6.0 for Spark 1.4.0. But we've
already upgraded Parquet to
Hi Shao,
Thank you for your quick reply.
I was disappointed.
I will try window operations with the receiver-based approach
(KafkaUtils.createStream).
Thank you again,
ZIGEN
Message from Saisai Shao sai.sai.s...@gmail.com, 2015/06/12 17:18:
I think you cannot use offsetRange in such a way, when you
I would like to know if Spark has any facility by which particular tasks
can be scheduled to run on chosen nodes.
The use case: we have a large custom-format database. It is partitioned
and the segments are stored on local SSD on multiple nodes. Incoming
queries are matched against the
Cody,
On 12 Jun 2015, at 17:26, Cody Koeninger c...@koeninger.org wrote:
There are several database APIs that use a thread-local or singleton
reference to a connection pool (we use ScalikeJDBC currently, but there are
others).
You can use mapPartitions earlier in the chain to make sure
Greetings.
I have been following some of the tutorials online for Spark k-means
clustering. I would like to be able to just dump all the cluster values
and their centroids to a text file so I can explore the data. I have the
clusters as such:
val clusters = KMeans.train(parsedData, numClusters,
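The snippet above is cut off; as a minimal sketch of the dump step (assuming parsedData is an RDD[Vector] and that numClusters and numIterations are already defined), the centroids and per-point assignments could be written out like this:

    import org.apache.spark.mllib.clustering.KMeans

    val clusters = KMeans.train(parsedData, numClusters, numIterations)

    // one centroid vector per line
    sc.parallelize(clusters.clusterCenters.map(_.toArray.mkString(",")))
      .saveAsTextFile("centroids")

    // each point with the id of the cluster it was assigned to
    parsedData.map(v => clusters.predict(v) + "\t" + v.toArray.mkString(","))
      .saveAsTextFile("assignments")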
Spark version is 1.3.0 (will upgrade as soon as we upgrade past mesos
0.19.0)...
Regardless, I'm running into a really weird situation where when I pass
--jars to bin/spark-shell I can't reference those classes in the REPL. Is
this expected? The logs even tell me that my jars have been added, and
On 12 Jun 2015, at 23:19, Cody Koeninger c...@koeninger.org wrote:
A. No, it's called once per partition. Usually you have more partitions than
executors, so it will end up getting called multiple times per executor. But
you can use a lazy val, singleton, etc to make sure the setup only
Hi,
If you want, I would be happy to work on this. I have worked with
KafkaUtils.createDirectStream before, in a pull request that wasn't
accepted: https://github.com/apache/spark/pull/5367. I'm fluent in Python
and I'm starting to feel comfortable with Scala, so if someone opens a JIRA
I can
A. No, it's called once per partition. Usually you have more partitions
than executors, so it will end up getting called multiple times per
executor. But you can use a lazy val, singleton, etc. to make sure the
setup only takes place once per JVM.
B. I can't speak to the specifics there ... but
Close. The mapPartitions call doesn't need to do anything at all to the
iter.
mapPartitions { iter =>
  SomeDb.conn.init
  iter
}
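For illustration, a minimal sketch of how such a singleton might look (the object name, JDBC driver and URL are assumptions, not the poster's actual code); the lazy val means the pool is set up at most once per executor JVM, the first time any task touches it:

    import scalikejdbc._

    object SomeDb {
      object conn {
        lazy val init: Unit = {
          // runs once per JVM on first access
          Class.forName("org.postgresql.Driver")
          ConnectionPool.singleton("jdbc:postgresql://db-host:5432/mydb", "user", "secret")
        }
      }
    }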
On Fri, Jun 12, 2015 at 3:55 PM, algermissen1971 algermissen1...@icloud.com
wrote:
Cody,
On 12 Jun 2015, at 17:26, Cody Koeninger c...@koeninger.org wrote:
On 12 Jun 2015, at 22:59, Cody Koeninger c...@koeninger.org wrote:
Close. The mapPartitions call doesn't need to do anything at all to the iter.
mapPartitions { iter =>
  SomeDb.conn.init
  iter
}
Yes, thanks!
Maybe you can confirm two more things, and then you will have helped me make a giant
To be concrete, say we have a folder with thousands of tab-delimited csv
files with the following attribute format (each csv file is about 10GB):
id    name    address    city    ...
1     Matt    add1       LA      ...
2     Will    add2       LA      ...
3     Lucy    add3       SF      ...
...
And we have a
Has anyone run into the same problem as me?
2015-06-12 23:40 GMT+08:00 Tao Li litao.bupt...@gmail.com:
Hi all:
I compiled the new Spark 1.4.0 version today. But when I run the WordCount demo,
it throws NoSuchMethodError *java.lang.NoSuchMethodError:
Turns out one of the other developers wrapped the jobs in a script and did a
cd to another folder in the script before executing spark-submit.
On 12 June 2015 at 14:20, Matthew Jones mle...@gmail.com wrote:
Hmm either spark-submit isn't picking up the relative path or Chronos is
not setting your
Looks like your Spark is not able to pick up HADOOP_CONF. To fix this,
you can actually add jets3t-0.9.0.jar to the classpath
(sc.addJar("/path/to/jets3t-0.9.0.jar")).
Thanks
Best Regards
On Thu, Jun 11, 2015 at 6:44 PM, shahab shahab.mok...@gmail.com wrote:
Hi,
I tried to read a csv file
This is a good start, if you haven't read it already
http://spark.apache.org/docs/latest/streaming-programming-guide.html#dataframe-and-sql-operations
Thanks
Best Regards
On Thu, Jun 11, 2015 at 8:17 PM, 唐思成 jadetan...@qq.com wrote:
Hi all:
We are trying to use Spark to do some real
Hello.
I used to be able to run and debug my Spark apps in Eclipse for Spark 1.3.1 by
creating a launch configuration and setting the VM argument -Dspark.master=local[4].
I am now playing with the new 1.4 and trying out some of my simple samples,
which all fail with the same exception as shown below. Running them with
You can disable shuffle spill (spark.shuffle.spill,
http://spark.apache.org/docs/latest/configuration.html#shuffle-behavior)
if you have enough memory to hold that much data. Otherwise, I believe adding
more resources would be your only choice.
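For reference, a minimal sketch of how those settings could be passed (the memory value is only an example, not a recommendation):

    val conf = new org.apache.spark.SparkConf()
      // trades disk I/O for memory pressure; only safe if the shuffle data fits in memory
      .set("spark.shuffle.spill", "false")
      .set("spark.executor.memory", "8g")   // example value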
Thanks
Best Regards
On Thu, Jun 11, 2015 at 9:46 PM, Al M
You can verify whether the jars are shipped properly by looking at the driver UI
(running on port 4040), under the Environment tab.
Thanks
Best Regards
On Sat, Jun 13, 2015 at 12:43 AM, Jonathan Coveney jcove...@gmail.com
wrote:
Spark version is 1.3.0 (will upgrade as soon as we upgrade past mesos
0.19.0)...
It launches two jobs because it doesn't know ahead of time how big your RDD
is, so it doesn't know what the sampling rate should be. After counting
all the records, it can determine what the sampling rate should be -- then
it does another pass through the data, sampling by the rate it just
How many cores are you allocating for your job? And how many receivers do
you have? It would be good if you could post your custom receiver code; it
will help people understand it better and shed some light.
Thanks
Best Regards
On Fri, Jun 12, 2015 at 12:58 PM, Chaudhary, Umesh
BTW, in Spark 1.4 announced today, I added SQLContext.getOrCreate, so you
don't need to create the singleton yourself.
On Wed, Jun 10, 2015 at 3:21 AM, Sergio Jiménez Barrio
drarse.a...@gmail.com wrote:
Note: CCing user@spark.apache.org
First, you must check if the RDD is empty:
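The snippet is cut off there; a minimal sketch of the pattern under discussion (assuming a DStream of records; the empty-RDD guard and the Spark 1.4 getOrCreate call are the two points from this thread):

    import org.apache.spark.sql.SQLContext

    stream.foreachRDD { rdd =>
      if (!rdd.isEmpty()) {   // first, check whether the micro-batch is empty
        val sqlContext = SQLContext.getOrCreate(rdd.sparkContext)
        // build a DataFrame from rdd and run the SQL work here
      }
    }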
Hi,
I have a scenario where I'd like to store an RDD using the parquet format in
many files, which correspond to days, such as 2015/01/01, 2015/02/02, etc.
So far I used this method
http://stackoverflow.com/questions/23995040/write-to-multiple-outputs-by-key-spark-one-spark-job
to store text files
2. Does 1.3.2 or 1.4 have any enhancements that can help? I tried to use
1.3.1 but SPARK-6967 prohibits me from doing so. Now that 1.4 is
available, would any of the JOIN enhancements help this situation?
I would try Spark 1.4 after running SET
spark.sql.planner.sortMergeJoin=true.
For example, Hive lets you set a whole bunch of parameters (# of reducers, #
of mappers, size of reducers, cache size, max memory to use for a join),
while Impala gives users a much smaller subset of parameters to work with,
which makes it nice to give to a BI team.
Is there a way to restrict
Hi Andrew,
Thanks a lot! Indeed, it doesn't start with "spark."; the following properties
are read by the driver implementation rather than through Spark conf:
--conf spooky.root=s3n://spooky- \
--conf spooky.checkpoint=s3://spooky-checkpoint \
This used to work from Spark 1.0.0 to 1.3.1. Do you know
Hi Team,
Sharing one article which summarizes the resource allocation configurations
for Spark on YARN:
Resource allocation configurations for Spark on Yarn
http://www.openkb.info/2015/06/resource-allocation-configurations-for.html
--
Thanks,
www.openkb.info
(Open KnowledgeBase for
Hey all,
I've recently run into an issue where Spark dynamicAllocation has asked for
-1 executors from YARN. Unfortunately, this raises an exception that kills
the executor-allocation thread, and the application can't request more
resources.
Has anyone seen this before? It is spurious and the
Hi,
I want to use spark to select N columns, top M rows of all csv files under
a folder.
To be concrete, say we have a folder with thousands of tab-delimited csv
files with the following attribute format (each csv file is about 10GB):
id    name    address    city    ...
1     Matt    add1
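As a minimal sketch of one way to do the "N columns, top M rows" part (the path, column indices and M=100 are assumptions; this takes the first rows of the whole dataset, not per file):

    // read every file under the folder, split on tabs, keep the first two columns
    val rows = sc.textFile("hdfs:///data/csv-folder/*")
      .map(_.split("\t"))
      .map(fields => (fields(0), fields(1)))

    // take the first 100 rows back to the driver
    rows.take(100).foreach(println)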
This is the SPARK JIRA which introduced the warning:
[SPARK-7037] [CORE] Inconsistent behavior for non-spark config properties
in spark-shell and spark-submit
On Fri, Jun 12, 2015 at 4:34 PM, Peng Cheng rhw...@gmail.com wrote:
Hi Andrew,
Thanks a lot! Indeed, it doesn't start with spark, the
Hi Juan,
I have created a ticket for this:
https://issues.apache.org/jira/browse/SPARK-8337
Thanks!
Amit
On Fri, Jun 12, 2015 at 3:17 PM, Juan Rodríguez Hortalá
juan.rodriguez.hort...@gmail.com wrote:
Hi,
If you want, I would be happy to work on this. I have worked with
Created PR and verified the example given by Justin works with the change:
https://github.com/apache/spark/pull/6793
Cheers
On Wed, Jun 10, 2015 at 7:15 PM, Ted Yu yuzhih...@gmail.com wrote:
Looks like the NPE came from this line:
@transient protected lazy val rng = new XORShiftRandom(seed
Spark 1.4 supports dynamic partitioning: you can first convert your RDD
to a DataFrame and then save the contents partitioned by the date column.
Say you have a DataFrame df containing three columns a, b, and c; you
may have something like this:
df.write.partitionBy(a,
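The snippet above is cut off; a minimal sketch of what the full call might look like (the output path and format are assumptions):

    // each distinct value of column "a" becomes a subdirectory such as .../a=2015-01-01/
    df.write
      .partitionBy("a")
      .format("parquet")
      .save("hdfs:///output/partitioned-by-date")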
Thanks all for your information. Andrew, I dug out one of your old posts,
which is relevant:
http://apache-spark-user-list.1001560.n3.nabble.com/little-confused-about-SPARK-JAVA-OPTS-alternatives-td5798.html
But it didn't mention how to supply properties that don't start with spark.
On 12 June
Greetings,
I am trying to implement a classic star schema ETL pipeline using Spark
SQL, 1.2.1. I am running into problems with shuffle joins, for those
dimension tables which have skewed keys and are too large to let Spark
broadcast them.
I have a few questions:
1. Can I split my queries so a
I would like to have a Spark Streaming SQS receiver which deletes SQS
messages only after they have been successfully stored on S3.
For this, a custom receiver can be implemented with the semantics of
a reliable receiver.
The store(multiple-records) call blocks until the given records have
been
Maybe it's related to a bug which was recently fixed by
https://github.com/apache/spark/pull/6558.
On Fri, Jun 12, 2015 at 5:38 AM, Bipin Nag bipin@gmail.com wrote:
Hi Cheng,
Yes, some rows contain unit instead of decimal values. I believe some rows
from original source I had don't have
Hi,
I am taking a broadcast value from a file. I want to use it to create a Rating
object (ALS).
But I am getting null. Here is my code
https://gist.github.com/yaseminn/d6afd0263f6db6ea4ec5 :
Lines 17 and 18 are OK, but line 19 returns null, so line 21 gives me an error.
I don't understand why. Do you have any idea
Hi, I am trying to write data that is produced from the Kafka command-line
producer for some topic. I am facing a problem and unable to proceed. Below
is my code, which I am building into a jar and running through spark-submit.
Am I doing something wrong inside foreachRDD()? What is wrong with
Thanks Todd, that solved my problem.
Regards,
Gaurav
(please excuse spelling mistakes)
Sent from phone
On Jun 11, 2015 6:42 PM, Todd Nist tsind...@gmail.com wrote:
Hi Gaurav,
Seems like you could use a broadcast variable for this if I understand
your use case. Create it in the driver based
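A minimal sketch of the broadcast-variable pattern being suggested (the lookup map and the records RDD are made-up examples):

    // build the reference data once on the driver, then broadcast it to all executors
    val lookup = Map("NY" -> "New York", "CA" -> "California")
    val lookupBc = sc.broadcast(lookup)

    // tasks read the broadcast value instead of shipping the map inside every closure
    val expanded = records.map(code => lookupBc.value.getOrElse(code, "unknown"))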
That's a great idea. I did what you suggested and added the URL of the
props file to the URI field of the JSON. The properties file now shows up in the
sandbox. But when it goes to run spark-submit with --properties-file
props.properties, it fails to find it:
Exception in thread main
Hello,
Just in case someone runs into the same issue: it was caused by running the
streaming app with a different version of the cluster jars (the uber jar
contained both yarn and spark).
Regards
J
These are both really good posts: you should try to get them into the
documentation.
With anything implementing dynamic behavior, there are some fun problems:
(a) detecting the delays in the workflow. There are some good ideas here
(b) deciding where to address it. That means you need to monitor the
The other thing to keep in mind about Spark window operations against Kafka
is that Spark Streaming is based on the current system clock, not the time
embedded in your messages.
So you're going to get a fundamentally wrong answer from a window operation
after a failure / restart, regardless of
I think the problem is that in my local etc/hosts file, I have
10.196.116.95 WIN02
I will remove it and try. Thanks for the help.
Ningjun
From: prajod.vettiyat...@wipro.com [mailto:prajod.vettiyat...@wipro.com]
Sent: Friday, June 12, 2015 1:44 AM
To: Wang, Ningjun (LNG-NPV)
Cc:
Hi,
I am using a Kafka + Spark cluster for a real-time aggregation analytics use case
in production.
*Cluster details*
*6 nodes*, each node running one Spark process and one Kafka process.
Node 1 - 1 Master, 1 Worker, 1 Driver,
1 Kafka process
Nodes 2,3,4,5,6 - 1 Worker process each
I’m trying to iterate through a list of Columns and create new Columns based on
a condition. However, the when method keeps giving me errors that don’t quite
make sense.
If I do `when(col === “abc”, 1).otherwise(0)` I get the following error at
compile time:
[error] not found: value when
Hi,
I have a scenario with spark streaming, where I need to write to a database
from within updateStateByKey[1].
That means that inside my update function I need a connection.
I have so far understood that I should create a new (lazy) connection for every
partition. But since I am not working
I'm using Play 2.4 with play-json 2.4. It works fine with spark-1.3.1, but it
fails after I upgrade Spark to spark-1.4.0 :(
sc.parallelize(1 to 1).count
[info] com.fasterxml.jackson.databind.JsonMappingException: Could not find
creator property with name 'id' (in class
I see the same thing in an app that uses Jackson 2.5. Downgrading to
2.4 made it work. I meant to go back and figure out if there's
something that can be done to work around this in Spark or elsewhere,
but for now, harmonize your Jackson version at 2.4.x if you can.
On Fri, Jun 12, 2015 at 4:20
Hi Chris,
Have you imported org.apache.spark.sql.functions._?
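For reference, a minimal sketch of the usage once that import is in place (the column name and literal are made up):

    import org.apache.spark.sql.functions._   // brings `when`, `col`, `lit`, ... into scope

    // 1 where the status column equals "abc", 0 otherwise
    val flagged = df.withColumn("flag", when(col("status") === "abc", 1).otherwise(0))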
Thanks,
Yin
On Fri, Jun 12, 2015 at 8:05 AM, Chris Freeman cfree...@alteryx.com wrote:
I’m trying to iterate through a list of Columns and create new Columns
based on a condition. However, the when method keeps giving me errors
Hi all:
I complied new spark 1.4.0 version today. But when I run WordCount demo, it
throws NoSuchMethodError *java.lang.NoSuchMethodError:
com.fasterxml.jackson.module.scala.deser.BigDecimalDeserialize*r.
I found the default *fasterxml.jackson.version* is *2.4.4*. It's there
any wrong with the
Trying to find complete documentation about the internal architecture of
Apache Spark, but have found nothing.
For example, I'm trying to understand the following: assume that we have a 1 TB
text file on HDFS (3 nodes in a cluster, replication factor is 1). This
file will be split into 128 MB
There are several database APIs that use a thread-local or singleton
reference to a connection pool (we use ScalikeJDBC currently, but there are
others).
You can use mapPartitions earlier in the chain to make sure the connection
pool is set up on that executor, then use it inside updateStateByKey
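A minimal sketch of that ordering (the stream type and the SomeDb singleton are assumptions carried over from earlier in the thread); the point is only that the pool-touching mapPartitions runs before updateStateByKey on the same executors:

    // stream: DStream[(String, Long)] is assumed
    val prepared = stream.mapPartitions { iter =>
      SomeDb.conn.init   // forces the pool to be set up on this executor
      iter               // records pass through unchanged
    }

    val state = prepared.updateStateByKey { (values: Seq[Long], prev: Option[Long]) =>
      val next = prev.getOrElse(0L) + values.sum
      // write `next` to the database here using the already-initialized pool
      Some(next)
    }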
That did it! Thanks!
From: Yin Huai
Date: Friday, June 12, 2015 at 10:31 AM
To: Chris Freeman
Cc: user@spark.apache.org
Subject: Re: Issues with `when` in Column class
Hi Chris,
Have you imported org.apache.spark.sql.functions._?
Thanks,
Yin
On Fri, Jun 12, 2015
Hmm, either spark-submit isn't picking up the relative path or Chronos is
not setting your working directory to your sandbox. Try using:
cd $MESOS_SANDBOX && spark-submit --properties-file props.properties
On Fri, Jun 12, 2015 at 12:32 PM Gary Ogden gog...@gmail.com wrote:
That's a great idea. I
For that you need SPARK-1537 and the patch to go with it.
It is still the Spark web UI; it just hands off storage and retrieval of the
history to the underlying YARN timeline server, rather than going through the
filesystem. You'll get to see things as they go along too.
If you do want to try it,
The Scala API has two ways of calling createDirectStream. One of them allows
you to pass a message handler that gets full access to the Kafka
MessageAndMetadata, including the offset.
I don't know why the Python API was developed with only one way to call
createDirectStream, but the first thing I'd
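For reference, a rough sketch of the Scala variant being described (the topic, starting offsets, kafkaParams and the tuple shape are assumptions):

    import kafka.common.TopicAndPartition
    import kafka.message.MessageAndMetadata
    import kafka.serializer.StringDecoder
    import org.apache.spark.streaming.kafka.KafkaUtils

    // the message handler sees the full MessageAndMetadata, so each record keeps its offset
    val fromOffsets = Map(TopicAndPartition("mytopic", 0) -> 0L)
    val stream = KafkaUtils.createDirectStream[
        String, String, StringDecoder, StringDecoder, (Long, String)](
      ssc, kafkaParams, fromOffsets,
      (mmd: MessageAndMetadata[String, String]) => (mmd.offset, mmd.message())
    )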
Sent from my phone
On Jun 11, 2015, at 8:43 AM, Sanjay Subramanian
sanjaysubraman...@yahoo.com.INVALID wrote:
Hey guys,
I use Hive and Impala daily and intensively.
I want to transition to spark-sql in CLI mode.
Currently in my sandbox I am using Spark (standalone mode) in the CDH
Hey Kiran,
Thanks very much for the response. I left for vacation before I could try
this out, but I'll experiment once I get back and let you know how it goes.
Thanks!
James.
On 8 June 2015 at 12:34, kiran lonikar loni...@gmail.com wrote:
It turns out my assumption on load and unionAll
Thanks guys, my question must look like a stupid one today :) Looking
forward to testing out 1.4.0, just downloaded it.
Congrats to the team for this much-anticipated release.
On Fri, Jun 12, 2015 at 10:12 PM, Guru Medasani gdm...@gmail.com wrote:
Here is a Spark 1.4 release blog by Databricks.
You don't add dependencies to your app -- you mark Spark as 'provided'
in the build and you rely on the deployed Spark environment to provide
it.
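As a minimal sketch of what that looks like in an sbt build (the Spark version and module list are assumptions):

    // build.sbt: compile against Spark, but don't bundle it in the application jar
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "1.4.0" % "provided"
    )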
On Fri, Jun 12, 2015 at 7:14 PM, Elkhan Dadashov elkhan8...@gmail.com wrote:
Hi all,
We want to integrate Spark in our Java application using the
In Spark 1.3.x, system properties of the driver could be set via the --conf
option, which was shared between Spark properties and system properties.
In Spark 1.4.0 this feature is removed; the driver instead logs the following
warning:
Warning: Ignoring non-spark config property: xxx.xxx=v
How do
Hi Peng,
Setting properties through --conf should still work in Spark 1.4. From the
warning, it looks like the config you are trying to set does not start with
the prefix "spark.". What is the config that you are trying to set?
-Andrew
2015-06-12 11:17 GMT-07:00 Peng Cheng pc...@uow.edu.au:
In
Tom,
Can you file a JIRA and attach a small reproducible test case if possible?
On Tue, May 19, 2015 at 1:50 PM, Thomas Dudziak tom...@gmail.com wrote:
Under certain circumstances that I haven't yet been able to isolate, I get
the following error when doing a HQL query using HiveContext
Thanks for the prompt response, Sean.
The issue is that we are restricted in the dependencies we can include in our
project.
There are 2 issues when including dependencies:
1) there are several dependencies which both we and Spark have, but the versions
are conflicting.
2) there are dependencies Spark has,
2) there are dependencies Spark has,
Hi all,
We want to integrate Spark into our Java application using the Spark Java API
and then run it on YARN clusters.
If I want to run Spark on YARN, which dependencies are a must to include?
I looked at the Spark POM
Hi all,
I am running spark standalone (local[*]), and have tried to cut back on
some of the logging noise from the framework by editing log4j.properties in
spark/conf.
The following lines are working as expected:
log4j.logger.org.apache.spark=WARN
(This question is also present on StackOverflow
http://stackoverflow.com/questions/30656083/spark-pyspark-errors-on-mysterious-missing-tmp-file
)
I'm having issues with pyspark and a missing /tmp file. I've narrowed down
the behavior to a short snippet.
a=sc.parallelize([(16646160,1)]) #
It sounds like this might be caused by a memory configuration problem. In
addition to looking at the executor memory, I'd also bump up the driver memory,
since it appears that your shell is running out of memory when collecting a
large query result.
Sent from my phone
On Jun 11, 2015, at