On 11 May 2016 at 03:01, 李明伟 <kramer2...@126.com> wrote:
Hi Mich
From the ps command I can find four processes. 10409 is the master and 10603 is
the worker; 12420 is the driver program and 12578 should be the executor
(worker). Am I right?
Hi Ted
Spark version: spark-1.6.0-bin-hadoop2.6
I tried increasing the executor memory. Still the same problem.
I can use jmap to capture something, but the output is too difficult for me to
understand.
At 2016-05-11 11:50:14, "Ted Yu" wrote:
Which Spark release are you using?
On 11 May 2016 at 01:22, 李明伟 <kramer2...@126.com> wrote:
I actually provided them in the submit command here:
nohup ./bin/spark-submit --master spark://ES01:7077 --executor-memory 4G
--num-executors 1 --total-executor-cores 1 --conf
"spark.storage.memoryFraction=…"
Mich Talebzadeh
LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com
On 10 May 2016 at 03:12, 李明伟 <kramer2...@126.com> wrote:
Hi Mich
I added some more info (the spark-env.sh settings and the top command output).
On 9 May 2016 at 16:19, 李明伟 <kramer2...@126.com> wrote:
Thanks for all the information guys.
I wrote some code to do the test, not using window, so it only calculates data
for each batch interval. I set the interval to 30 seconds and also reduced the
size of the data.
…stored in memory is that a streaming source is not a persistent source, so you
need to have a place to store the data.
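A minimal sketch of the kind of test described here (per-batch computation, no
window, 30-second interval); the source and the counting logic are
placeholders, not the original test code:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("BatchIntervalTest")
    val ssc = new StreamingContext(conf, Seconds(30))   // 30-second batch interval

    // Hypothetical source; each 30-second batch is processed on its own,
    // so no window state accumulates across batches.
    val lines = ssc.socketTextStream("localhost", 9999)
    lines.count().print()

    ssc.start()
    ssc.awaitTermination()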
On Mon, May 9, 2016 at 4:43 PM, 李明伟 <kramer2...@126.com> wrote:
Thanks.
What if I use batch calculation instead of stream computing? Do I still need
that much memory? For example, if the 24-hour data set is 100 GB, do I also
need 100 GB of RAM to do the one-time batch calculation?
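For the one-time batch case, a hedged sketch of what such a job could look
like: read the 24-hour data set from disk and aggregate it. Spark processes
the input partition by partition and can spill to disk, so the cluster does
not need RAM equal to the full 100 GB input. The path and the CSV key field
are assumptions:

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("DailyBatchReport"))

    // Hypothetical 24-hour input; Spark reads it partition by partition.
    val day = sc.textFile("hdfs:///data/2016-05-09/*")
    val report = day
      .map(line => (line.split(",")(0), 1L))   // assumed CSV with a leading key field
      .reduceByKey(_ + _)
    report.saveAsTextFile("hdfs:///reports/2016-05-09")  // hypothetical output path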
At 2016-05-09 15:14:47, "Saisai Shao" wrote:
Thanks Mich
I guess I did not make my question clear enough. I know terms like interval
and window, and I also know how to use them. The problem is that in my case I
need to set the window to cover data for 24 hours or 1 hour. I am not sure if
that is a good approach, because the window is just too large.
> But my way is to set up a forever loop to handle continuously incoming data.
> Not sure if it is the right way to use Spark.
Not sure what this means. Do you use Spark Streaming for doing the batch jobs
in the forever loop?
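For comparison, a sketch of what the 24-hour window under discussion would
look like in Spark Streaming; the source and key extraction are assumptions. A
window this large forces Spark to retain 24 hours' worth of batch data, which
is exactly the memory concern raised above:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Minutes, Seconds, StreamingContext}

    val ssc = new StreamingContext(new SparkConf().setAppName("WindowedReport"), Seconds(30))

    val events = ssc.socketTextStream("localhost", 9999)  // hypothetical source
    val counts = events
      .map(e => (e.split(",")(0), 1L))                    // assumed keyed records
      .reduceByKeyAndWindow((a: Long, b: Long) => a + b,
                            Minutes(24 * 60),             // 24-hour window
                            Minutes(5))                   // slide every 5 minutes
    counts.print()

    ssc.start()
    ssc.awaitTermination()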
On Wed, Apr 20, 2016 at 3:55 PM, 李明伟 <kramer2...@126.com> wrote:
…spark.driver.memory and spark.driver.maxResultSize
On Tue, Apr 19, 2016 at 4:06 PM, 李明伟 <kramer2...@126.com> wrote:
Hi Zhan Zhang
Please see the exception trace below. It is saying some GC overhead limit
error. I am not a Java or Scala developer, so it is hard for me to understand
this information.
Reading the coredump is also hard for me.
The memory parameters: --executor-memory 8G --driver-memory 4G. Please note
that the data size is very small; the total size of the data is less than 10 MB.
As for jmap, it is a little hard for me to do so. I am not a Java developer. I
will google jmap first, thanks.
Regards
Mingwei
…you can use the coredump to find what caused the OOM.
Thanks.
Zhan Zhang
On Apr 18, 2016, at 9:44 PM, 李明伟 <kramer2...@126.com> wrote:
Hi Samaga
Thanks very much for your reply, and sorry for the delayed response.
Cassandra or Hive is a good suggestion. However, in my situation I am not sure
it will make sense.
My requirement is to get the most recent 24 hours of data to generate a
report, with a frequency of 5 minutes.
So if I use…
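A hedged sketch of the "forever loop" batch alternative mentioned earlier in
the thread: every 5 minutes, re-read whatever files cover the last 24 hours
and regenerate the report. The path, file layout, and report logic are all
illustrative assumptions:

    import org.apache.spark.{SparkConf, SparkContext}

    object LoopedReport {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("LoopedReport"))
        while (true) {
          // Hypothetical directory holding only the last 24 hours of files.
          val last24h = sc.textFile("hdfs:///data/last24h/*")
          println(s"records in window: ${last24h.count()}")  // stand-in for the real report
          Thread.sleep(5 * 60 * 1000)                        // wait 5 minutes
        }
      }
    }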
Hi Anthony
Thanks. You are right, the API will read all the files, so there is no need to merge them.
At 2016-03-31 20:09:25, "Femi Anthony" wrote:
Also, ssc.textFileStream(dataDir) will read all the files from a directory, so
as far as I can see there's no need to merge the files. Just…
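A minimal sketch of the textFileStream pattern Femi describes: Spark Streaming
watches the directory and treats every new file as input for the next batch,
so the files never need merging. The directory and interval are assumptions:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val ssc = new StreamingContext(new SparkConf().setAppName("DirWatch"), Seconds(30))
    val dataDir = "hdfs:///incoming"           // hypothetical landing directory
    val lines = ssc.textFileStream(dataDir)    // picks up each new file, no merging needed
    lines.count().print()

    ssc.start()
    ssc.awaitTermination()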