rdd.map { record => () }.saveToCassandra("keyspace1", "table1") --> is
running on the executors.
stream1.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges) -->
is running on the driver.
Is this the reason why the Kafka offsets are committed even when an
exception is raised? If so, is there a way to commit the offsets only
after the write to Cassandra has succeeded?
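A minimal sketch of one way to get that ordering, assuming stream1 comes
from KafkaUtils.createDirectStream (spark-streaming-kafka-0-10) and the
spark-cassandra-connector is on the classpath; the record-to-column
mapping is illustrative:

import com.datastax.spark.connector._
import org.apache.spark.streaming.kafka010.{CanCommitOffsets, HasOffsetRanges}

stream1.foreachRDD { rdd =>
  // runs on the driver: capture the offset ranges for this batch
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  try {
    // runs on the executors; saveToCassandra is an action, so a failed
    // write job surfaces here on the driver as an exception
    rdd.map { record => (record.key, record.value) }
      .saveToCassandra("keyspace1", "table1")
    // reached only if the write job succeeded
    stream1.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
  } catch {
    case e: Exception =>
      // no commit: this batch will be re-read from the old offsets
      println(s"write failed, offsets not committed: $e")
  }
}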
Thanks again Cody,
My understanding is that all the code inside foreachRDD runs on the driver,
except for
rdd.map { record => () }.saveToCassandra("keyspace1", "table1").
When the exception is raised, I was thinking I won't be committing the
offsets, but the offsets are committed all the time, independent of the
exception.
This can be mapped as below:
dataset.map(x => ((x(0), x(1), x(2)), x))
This works with a DataFrame of Rows, but I haven't tried it with a Dataset
(see the sketch below).
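A minimal sketch of that keying step, assuming a DataFrame (Dataset[Row]);
going through .rdd sidesteps the need for an Encoder for the tuple key,
since Row.apply(i) returns Any:

import org.apache.spark.sql.Row
// key each row by its first three columns; pairRdd can then be used
// with the usual byKey operations (groupByKey, reduceByKey, ...)
val pairRdd = dataset.rdd.map(x => ((x(0), x(1), x(2)), x))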
Thanks
Deepak
On Mon, Aug 7, 2017 at 8:21 AM, Jone Zhang wrote:
> val schema = StructType(
> Seq(
> StructField("app", StringType, nullable = true)
Hello - I have HDP 2.5.x and I'm trying to launch spark-shell.
The ApplicationMaster gets launched, but YARN is not able to assign containers.
*Command ->*
./bin/spark-shell --master yarn-client --driver-memory 512m
--executor-memory 512m
*Error ->*
[Sun Aug 06 19:33:29 +0000 2017] Application is a
val schema = StructType(
  Seq(
    StructField("app", StringType, nullable = true),
    StructField("server", StringType, nullable = true),
    StructField("file", StringType, nullable = true),
    StructField("...", StringType, nullable = true)
  )
)
val row =
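The message is cut off here; a minimal sketch of how a Row matching that
schema might be built into a DataFrame (the values are hypothetical, and a
SparkSession named spark is assumed):

import org.apache.spark.sql.Row
// hypothetical values for the app/server/file/... columns
val row = Row("myApp", "server01", "/data/input/part-00000", "...")
val df = spark.createDataFrame(spark.sparkContext.parallelize(Seq(row)), schema)
df.show()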
I mean that the kafka consumers running on the executors should not be
automatically committing, because the fact that a message was read by
the consumer has no bearing on whether it was actually successfully
processed after reading.
It sounds to me like you're confused about where code is running.
Hi Marco,
thanks a ton, I will surely use those alternatives.
Regards,
Gourav Sengupta
On Sun, Aug 6, 2017 at 3:45 PM, Marco Mistroni wrote:
> Sengupta
> further to this, if you try the following notebook in databricks cloud,
> it will read a .csv file, write to a parquet file and read it again (just
> to count the number of rows stored)
Thanks Cody for your response.
All I want to do is commit the offsets only if I am successfully able to
write to the Cassandra database.
The line //save the rdd to Cassandra database is
rdd.map { record => () }.saveToCassandra("keyspace1", "table1")
What do you mean by "Executors shouldn't be auto-committing"?
update - it seems 'spark-shell' does not support mode -> yarn-cluster (I
guess since it is an interactive shell).
The only modes supported are -> yarn-client & local.
Please let me know if my understanding is incorrect.
Thanks!
On Sun, Aug 6, 2017 at 10:07 AM, karan alang wrote:
> Hello all - i'd
Hello all - I had a basic question on the modes in which spark-shell can be
run.
When I run the following command, does Spark run in local mode, i.e.
outside of YARN and using the local cores (since the '--master' option is
missing)?
./bin/spark-shell --driver-memory 512m --executor-memory 512m
Simila
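For reference, when --master is omitted, spark-shell defaults to local mode
using all local cores; the effective master can be checked from inside the
shell:

scala> sc.master
res0: String = local[*]   // default when no --master flag is given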
If your complaint is about offsets being committed that you didn't
expect... auto commit being false on executors shouldn't have anything
to do with that. Executors shouldn't be auto-committing; that's why
it's being overridden.
What you've said and the code you posted isn't really enough to
explain what is happening.
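For context, a minimal sketch of the consumer config usually passed to
createDirectStream (the broker and group id are placeholders); the point is
that enable.auto.commit is kept false, and Spark overrides it to false on
the executors in any case, so commits only happen where commitAsync is
called:

import org.apache.kafka.common.serialization.StringDeserializer
val kafkaParams = Map[String, Object](
  "bootstrap.servers"  -> "broker1:9092",              // placeholder
  "key.deserializer"   -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id"           -> "example-group",             // placeholder
  "enable.auto.commit" -> (false: java.lang.Boolean)   // commits stay manual
)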
Sengupta
further to this, if you try the following notebook in databricks cloud, it
will read a .csv file, write to a parquet file and read it again (just to
count the number of rows stored).
Please note that the path to the csv file might differ for you.
So, what you will need to do is
1 - cre
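The sequence that notebook performs, as a minimal Scala sketch (the paths
are illustrative, and a SparkSession named spark is assumed):

val df = spark.read.option("header", "true").csv("/path/to/input.csv")
df.write.mode("overwrite").parquet("/path/to/output.parquet")
val n = spark.read.parquet("/path/to/output.parquet").count()
println(s"rows stored: $n")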