Hi Jean,
We prepare the data for all the other jobs. We have a lot of jobs that are
scheduled at different times, but all of them need to read the same raw data.
On Fri, Nov 3, 2017 at 12:49 PM Jean Georges Perrin
wrote:
Hi Oren,
Why don’t you want to use a GroupBy? You can cache or checkpoint the result and
use it in your process, keeping everything in Spark and avoiding
save/ingestion...
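In sketch form, assuming a hypothetical DataFrame, grouping column, and paths (none of these names are from the original thread):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("shared-raw").getOrCreate()
spark.sparkContext.setCheckpointDir("/tmp/checkpoints")  // hypothetical path

val raw = spark.read.parquet("/data/raw")            // hypothetical path
val grouped = raw.groupBy("user_id").count()         // hypothetical grouping

// cache() keeps the result in executor memory for reuse within this app;
// checkpoint() materializes it to the checkpoint dir and truncates the
// lineage, so downstream jobs in the same application reuse it cheaply.
grouped.cache()
val stable = grouped.checkpoint()
```

Caching keeps everything inside one Spark application; checkpointing additionally survives lineage recomputation, which is the "avoiding save/ingestion" idea above.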
> On Oct 31, 2017, at 08:17, אורן שמון <oren.sha...@gmail.com> wrote:
I have 2 Spark jobs: one is the pre-process and the second is the process.
The process job needs to run a calculation for each user in the data.
I want to avoid a shuffle like groupBy, so I am thinking about saving the
result of the pre-process either bucketed by user in Parquet, or
re-partitioned by user, and then saving the result.
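A minimal sketch of the bucketed-write idea (the `user_id` column, paths, and table name are hypothetical, not from the original message):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("preprocess").getOrCreate()

// Hypothetical pre-processed DataFrame with a user_id column.
val preprocessed = spark.read.parquet("/data/raw").select("user_id", "payload")

// Bucketing co-locates rows with the same user_id in the same bucket file,
// so a later per-user aggregation can avoid a full shuffle.
preprocessed.write
  .bucketBy(200, "user_id")
  .sortBy("user_id")
  .format("parquet")
  .saveAsTable("preprocessed_by_user")
```

Note that `bucketBy` only works with `saveAsTable` (the bucket metadata lives in the metastore), not a plain path-based `save`; the alternative sketched in the message, `repartition($"user_id")` before writing, still pays one shuffle at write time but lets the downstream job read already-partitioned data.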
oops sorry. Please ignore this. wrong mailing list
Hi All,
I read the docs, however I still have the following question: for stateful
stream processing, is HDFS mandatory? In some places I see it is required,
and in other places I see that RocksDB can be used. I just want to know if
HDFS is mandatory for stateful stream processing?
Thanks!
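For what it's worth, in Structured Streaming the two are not alternatives: RocksDB (built in since Spark 3.2) holds the working state locally on each executor, while the checkpoint location must still be a fault-tolerant filesystem (HDFS, S3, ABFS, ...). A sketch, with a hypothetical checkpoint path:

```scala
import org.apache.spark.sql.SparkSession

// RocksDB replaces the in-memory state store on executors, not the
// fault-tolerant checkpoint storage.
val spark = SparkSession.builder()
  .appName("stateful-stream")
  .config("spark.sql.streaming.stateStore.providerClass",
    "org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider")
  .getOrCreate()

// The checkpoint path is hypothetical; any reliable FS works, not only HDFS:
// query.writeStream.option("checkpointLocation", "hdfs:///checkpoints/app1")
```

So HDFS specifically is not mandatory, but some fault-tolerant storage for checkpoints is, even with RocksDB.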
Hi Adam, many thanks for your detailed reply; the three videos are very
useful references for me. Actually, the app submitted to the IBM Spark
Contest is a very small demo. I'll do much more work to enhance that model,
and recently we just started a new project which aims to build a platform
that
Hi, yes, there's definitely a market for Apache Spark in financial
institutions. I can't provide specific details, but to answer your survey:
"yes" and "more than a few GB!"
Here are a couple of examples showing Spark with financial data, full
disclosure that I wo
Hi, guys,
I'm a quant engineer in China, and I believe it's very promising to use
Spark in the financial market. But I didn't find cases which combine
Spark and finance.
So here I want to do a small survey:
- do you guys use Spark in financial-market-related proje
I found there are several .conf files in the conf directory; which one is
used as the default when I click the "new" button on the notebook
homepage? I want to edit the default profile configuration so all my
notebooks are created with custom settings.
--
Thanks,
David S.
Hello test
Hi all,
I have to deal with a lot of data, and I have used Spark for months.
Now I am trying to use Vectors.sparse to generate a large feature vector, but
the feature size may exceed 4 billion, above the max of Int, so I want to use
BigInt or Long to deal with it.
But I read the code and documentation that
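MLlib's vector indices and sizes are indeed `Int`, so the dimension is hard-capped at `Int.MaxValue` (about 2.1 billion). A common workaround is to hash features into a smaller fixed-size space rather than index them directly. A sketch (the column names are hypothetical):

```scala
import org.apache.spark.ml.feature.HashingTF
import org.apache.spark.ml.linalg.Vectors

// Vectors.sparse takes an Int size, so ~2.15 billion is the hard ceiling:
val v = Vectors.sparse(Int.MaxValue, Array(0, 7), Array(1.0, 2.0))

// Feature hashing maps an unbounded feature space into a fixed number of
// buckets, trading a few collisions for an Int-sized dimension.
val hasher = new HashingTF()
  .setInputCol("raw_features")   // hypothetical column
  .setOutputCol("features")
  .setNumFeatures(1 << 22)       // ~4M buckets; tune for your collision budget
```

This trades some collisions for an index type the library supports; there is no Long-indexed vector in MLlib itself.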
Hi, Siddharth
You can rebuild Spark with Maven by specifying -Dhadoop.version=2.5.0
Thanks,
Sun.
fightf...@163.com
From: Siddharth Ubale
Date: 2015-01-30 15:50
To: user@spark.apache.org
Subject: Hi: hadoop 2.5 for spark
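The rebuild Sun describes would look roughly like this (a sketch for a Spark 1.x source tree; the hadoop-2.4 profile is the one documented to cover Hadoop 2.4.0 and later):

```shell
# Build Spark against Hadoop 2.5.0 (hadoop-2.4 profile covers 2.4.0+).
mvn -Phadoop-2.4 -Dhadoop.version=2.5.0 -DskipTests clean package
```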
You can use the prebuilt version that is built against Hadoop 2.4.
From: Siddharth Ubale
Date: 2015-01-30 15:50
To: user@spark.apache.org
Subject: Hi: hadoop 2.5 for spark
Hi,
I am a beginner with Apache Spark.
Can anyone let me know if it is mandatory to build Spark with the Hadoop
version I am using, or can I use a pre-built package and use it with my
existing HDFS root folder?
I am using Hadoop 2.5.0 and want to use Apache Spark 1.2.0 with it.
I could see a pre
Hi,
I just wanted to say hi to the Spark community. I'm developing some
stuff right now using Spark (we started very recently). As the API
documentation of Spark is really, really good, I'd like to get deeper
knowledge of the internal stuff - you know, the goodies. Watching movies
Hi,
Actually, several Java task threads run within a single executor; they are
not separate processes. So each executor has only one JVM runtime, which is
shared by the different task threads.
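A quick way to see this in plain JVM terms (no Spark required; in Spark, each task would play the role of one of these threads):

```scala
import java.lang.management.ManagementFactory

object SharedJvmDemo extends App {
  // Every thread reports the same runtime name: one JVM, many task threads.
  val jvmName = ManagementFactory.getRuntimeMXBean.getName  // e.g. "12345@host"

  val threads = (1 to 3).map { i =>
    new Thread(() => println(s"task thread $i runs in JVM $jvmName"))
  }
  threads.foreach(_.start())
  threads.foreach(_.join())
}
```

All three threads print the same JVM name, which is why PROCESS_LOCAL data is reachable by any task scheduled on that executor.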
Thanks
Jerry
From: rapelly kartheek [mailto:kartheek.m...@gmail.com]
Sent: Wednesday, August 20, 2014 5:29 PM
To: user
Hi
I have this doubt:
I understand that each Java process runs in its own JVM instance. Now,
if I have a single executor on my machine and run several Java processes,
then there will be several JVM instances running.
Now, PROCESS_LOCAL means the data is located on the same JVM as the task
Open your web UI in the browser, see the Spark URL in the top-left corner
of the page, and use it while starting your spark-shell instead of
localhost:7077.
Thanks
Best Regards
On Mon, Jun 23, 2014 at 10:56 AM, rapelly kartheek
wrote:
> Hi
> Can someone help me with the following
Please check what the Spark master URL is, and set that URL while launching
spark-shell.
You can get it from the terminal where the Spark master is running or from
the cluster UI: http://<master-host>:8080
Thanks,
Sourav
On Mon, Jun 23, 2014 at 10:56 AM, rapelly kartheek
wrote:
> Hi
> Can someone help me wi
Hi
Can someone help me with the following error that I faced while setting
up a single-node Spark framework?
karthik@karthik-OptiPlex-9020:~/spark-1.0.0$ MASTER=spark://localhost:7077
sbin/spark-shell
bash: sbin/spark-shell: No such file or directory
karthik@karthik-OptiPlex-9020:~/spark-1.0.0
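For what it's worth, the immediate "No such file or directory" error is just the path: spark-shell lives under bin/, not sbin/ (sbin/ holds the daemon start/stop scripts). A sketch of the corrected invocation, assuming the master really is listening on localhost:7077:

```shell
cd ~/spark-1.0.0
# spark-shell is in bin/, not sbin/; sbin/ only has the daemon scripts.
MASTER=spark://localhost:7077 bin/spark-shell
```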