Try the spark.local.dir property.
On Wed, Dec 5, 2018 at 1:42 PM JF Chen wrote:
> I have two Spark apps writing data to one directory. I notice they share
> one temp directory, and the app that finishes writing first will clear the
> temp directory, so the slower one may throw "No lease on *** File does
You can use the persist() or cache() operation on the DataFrame.
On Tue, Dec 26, 2017 at 4:02 PM Shu Li Zheng wrote:
> Hi all,
>
> I have a scenario like this:
>
> val df = dataframe.map().filter()
> // agg 1
> val query1 = df.sum.writeStream.start
> // agg 2
> val query2 =
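A batch sketch of the cache() suggestion above (the data and column names are illustrative; with two started streaming queries each query still runs its own execution, so caching helps where the shared work is a batch DataFrame):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder
  .appName("cache-sketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Illustrative stand-in for the shared dataframe.map().filter() result.
val df = Seq(("a", 1), ("a", 2), ("b", 3)).toDF("key", "value")
  .filter($"value" > 0)
  .cache() // computed once, reused by both aggregations below

val total = df.agg(sum($"value")) // first action populates the cache
val byKey = df.groupBy($"key").count() // served from the cached data
```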
Hi All,
I am getting the following error message while applying
*flatMapGroupsWithState.*
*Exception in thread "main" org.apache.spark.sql.AnalysisException:
flatMapGroupsWithState in update mode is not supported with aggregation on
a streaming DataFrame/Dataset;;*
Following is what I am trying to
"y").agg(...)
>
> Is this what you want?
>
> On Fri, Dec 8, 2017 at 11:51 AM, Sandip Mehta <sandip.mehta@gmail.com>
> wrote:
>
>> Hi,
>>
>> During my aggregation I end up with the following schema.
>>
>> Row(Row(val1,val2), Row(val1,val2,v
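On the flatMapGroupsWithState error above: the operator only composes with a downstream aggregation when it runs in Append mode. A hedged sketch of that shape, with illustrative types and names rather than the poster's code (`events` is assumed to be a streaming `Dataset[Event]`):

```scala
import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout, OutputMode}

// Hypothetical event and state types for illustration only.
case class Event(user: String, count: Int)
case class RunningTotal(total: Int)

// flatMapGroupsWithState in Append mode; running it in Update mode with
// an aggregation on its output is what raises the AnalysisException.
val updated = events
  .groupByKey(_.user)
  .flatMapGroupsWithState(OutputMode.Append, GroupStateTimeout.NoTimeout) {
    (user: String, rows: Iterator[Event], state: GroupState[RunningTotal]) =>
      val prev = state.getOption.map(_.total).getOrElse(0)
      val next = RunningTotal(prev + rows.map(_.count).sum)
      state.update(next)
      Iterator((user, next.total))
  }
```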
Hi,
During my aggregation I end up with the following schema.
Row(Row(val1,val2), Row(val1,val2,val3...))
val values = Seq(
(Row(10, 11), Row(10, 2, 11)),
(Row(10, 11), Row(10, 2, 11)),
(Row(20, 11), Row(10, 2, 11))
)
The 1st tuple is used to group the relevant records for
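One way to group on the first tuple is to keep it as a struct column and group by that column directly. A sketch, assuming `spark.implicits._` is in scope (tuple fields become struct fields `_1`, `_2`, `_3`; column names are illustrative):

```scala
import org.apache.spark.sql.functions._

// Rows shaped like (Row(val1, val2), Row(val1, val2, val3)), mirroring
// the sample values above.
val df = Seq(
  ((10, 11), (10, 2, 11)),
  ((10, 11), (10, 2, 11)),
  ((20, 11), (10, 2, 11))
).toDF("key", "metrics")

// Group by the whole first struct, aggregate fields of the second.
val grouped = df
  .groupBy($"key")
  .agg(sum($"metrics._3").as("sum_val3"))
```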
Hi All,
Is it fair to say that the number of jobs in a given Spark streaming application
is equal to the number of actions in the application?
Regards
Sandeep
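Roughly, yes: each action submits a job, although a few operations submit more than one job internally (for example, sortByKey samples the data first), so the counts are not always exactly equal. A sketch on a plain batch RDD, assuming an existing SparkContext `sc`:

```scala
val rdd = sc.parallelize(1 to 100).map(_ * 2)

rdd.count()   // action -> submits one job
rdd.collect() // action -> submits a second job

// Transformations alone (map, filter, ...) submit no job; they only
// extend the lineage until an action is called.
```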
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional
Hi All,
I have a table created in Hive and stored/read using HCatalog. The table is in
ORC format. I want to read this table in Spark SQL and join it with RDDs. How
can I connect to HCatalog and get data from Spark SQL?
SM
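Spark can read HCatalog-registered tables through the Hive metastore, so a HiveContext (the Spark 1.x API) is usually enough; no separate HCatalog client is needed. A sketch, where the table name and `otherDF` are illustrative:

```scala
import org.apache.spark.sql.hive.HiveContext

// Assumes an existing SparkContext `sc` and a Hive metastore reachable
// via hive-site.xml on the classpath.
val hiveContext = new HiveContext(sc)

// "mydb.orc_table" is an illustrative name; ORC is handled transparently
// because the storage format is recorded in the metastore.
val orcDF = hiveContext.sql("SELECT * FROM mydb.orc_table")

// Convert the RDD side to a DataFrame first, then join.
import hiveContext.implicits._
// val otherDF = someRdd.toDF("id", "value")  // hypothetical RDD side
// val joined  = orcDF.join(otherDF, "id")
```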
ows.)
>
> I had a separate set of Spark jobs that pulled the raw data from Cassandra,
> computed the aggregations and more complex metrics, and wrote it back to the
> relevant Cassandra tables. These jobs ran periodically every few minutes.
>
> Regards,
> Sanket
dedicated data
> storage systems. Like a database, or a key-value store. Spark Streaming would
> just aggregate and push the necessary data to the data store.
>
> TD
>
> On Sat, Nov 14, 2015 at 9:32 PM, Sandip Mehta <sandip.mehta@gmail.com>
tter. Otherwise, second and third approaches are more
> efficient.
>
> TD
>
>
> On Wed, Nov 18, 2015 at 7:15 AM, Sandip Mehta <sandip.mehta@gmail.com> wrote:
> TD thank you for your reply.
>
> I agree on data st
Hi,
I am working on a requirement to calculate real-time metrics and am building a
prototype on Spark Streaming. I need to build aggregates at the Seconds,
Minutes, Hours, and Day level.
I am not sure whether I should calculate all these aggregates as different
windowed functions on the input DStream or
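A sketch of the windowed-DStream side of that trade-off, assuming a `(key, value)` pair DStream named `pairs` (names and durations are illustrative). Per TD's advice later in this thread, the larger hour/day granularities are usually better rolled up from these fine-grained results in an external store than held as day-long Spark windows:

```scala
import org.apache.spark.streaming.{Minutes, Seconds}

// Per-minute totals, sliding every 10 seconds. The inverse function
// makes the window incremental (requires checkpointing to be enabled).
val perMinute = pairs.reduceByKeyAndWindow(
  (a: Long, b: Long) => a + b, // add values entering the window
  (a: Long, b: Long) => a - b, // subtract values leaving the window
  Minutes(1),                  // window length
  Seconds(10)                  // slide interval
)
```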
I believe you’ll have to use another way of creating the StreamingContext, by
passing a create function to the getOrCreate function.
private def setupSparkContext(): StreamingContext = {
  val streamingSparkContext = {
    val sparkConf = new SparkConf().setAppName(config.appName).setMaster(config.master)
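A sketch of wiring that create function into getOrCreate, with an illustrative checkpoint path (the create function must also call `ssc.checkpoint` on the same directory for recovery to work):

```scala
import org.apache.spark.streaming.StreamingContext

// Illustrative checkpoint location.
val checkpointDir = "/tmp/streaming-checkpoint"

// On a fresh start the function builds a new context; on restart the
// context is reconstructed from the checkpoint data instead.
val ssc = StreamingContext.getOrCreate(checkpointDir, () => setupSparkContext())

ssc.start()
ssc.awaitTermination()
```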
Does this help?
final JavaHBaseContext hbaseContext = new JavaHBaseContext(javaSparkContext,
conf);
customerModels.foreachRDD(new Function() {
private static final long serialVersionUID = 1L;
@Override
public Void call(JavaRDD currentRDD) throws Exception {
anning.com/books/spark-graphx-in-action>
>
>
>
>
>
>> On 1 Oct 2015, at 23:06, Sandip Mehta <sandip.mehta@gmail.com> wrote
Hi,
I wanted to understand the purpose of the Call Site in SparkContext.
Regards
SM
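The call site is the user-code location Spark records when an RDD or job is created; it is what appears as the job/stage description in the web UI and in logs. It can also be overridden for clearer labels. A sketch, assuming an existing SparkContext `sc` and an illustrative input path:

```scala
// Label subsequent RDD/job creations on this thread.
sc.setCallSite("nightly-aggregation: load input")

// This RDD's call site in the UI now shows the label above instead of
// the file/line where textFile was invoked.
val data = sc.textFile("/data/input")

// Restore the default (file/line of the calling user code).
sc.clearCallSite()
```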