rdd.map { record => () }.saveToCassandra("keyspace1", "table1") --> is
running on the executors.
stream1.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges) -->
is running on the driver.
Is this the reason why the Kafka offsets are committed even when an
exception is raised? If so, is there a way to commit the offsets only
after the write to Cassandra has succeeded?
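A minimal sketch of one way to get that ordering, assuming stream1 comes
from KafkaUtils.createDirectStream (spark-streaming-kafka-0-10) and the
spark-cassandra-connector is on the classpath; the record-to-column
mapping is illustrative:

import com.datastax.spark.connector._
import org.apache.spark.streaming.kafka010.{CanCommitOffsets, HasOffsetRanges}

stream1.foreachRDD { rdd =>
  // runs on the driver: capture the offset ranges for this batch
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  try {
    // runs on the executors; saveToCassandra is an action, so a failed
    // write job surfaces here on the driver as an exception
    rdd.map { record => (record.key, record.value) }
      .saveToCassandra("keyspace1", "table1")
    // reached only if the write job succeeded
    stream1.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
  } catch {
    case e: Exception =>
      // no commit: this batch will be re-read from the old offsets
      println(s"write failed, offsets not committed: $e")
  }
}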
Thanks again Cody,
My understanding is that all the code inside foreachRDD runs on the driver,
except for
rdd.map { record => () }.saveToCassandra("keyspace1", "table1").
When the exception is raised, I was thinking I won't be committing the
offsets, but the offsets are committed all the time, independent of the
exception.
This can be mapped as below:
dataset.map(x => ((x(0), x(1), x(2)), x))
This works with a DataFrame of Rows, but I haven't tried it with a Dataset
(see the sketch below).
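A minimal sketch of that keying step, assuming a DataFrame (Dataset[Row]);
going through .rdd sidesteps the need for an Encoder for the tuple key,
since Row.apply(i) returns Any:

import org.apache.spark.sql.Row
// key each row by its first three columns; pairRdd can then be used
// with the usual byKey operations (groupByKey, reduceByKey, ...)
val pairRdd = dataset.rdd.map(x => ((x(0), x(1), x(2)), x))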
Thanks
Deepak
On Mon, Aug 7, 2017 at 8:21 AM, Jone Zhang wrote:
> val schema = StructType(
> Seq(
> StructField("app", StringType, nullable = true)
Hello - I have HDP 2.5.x and I'm trying to launch spark-shell.
The ApplicationMaster gets launched, but YARN is not able to assign containers.
*Command ->*
./bin/spark-shell --master yarn-client --driver-memory 512m
--executor-memory 512m
*Error ->*
[Sun Aug 06 19:33:29 +0000 2017] Application is a
val schema = StructType(
  Seq(
    StructField("app", StringType, nullable = true),
    StructField("server", StringType, nullable = true),
    StructField("file", StringType, nullable = true),
    StructField("...", StringType, nullable = true)
  )
)
val row =
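The message is cut off here; a minimal sketch of how a Row matching that
schema might be built into a DataFrame (the values are hypothetical, and a
SparkSession named spark is assumed):

import org.apache.spark.sql.Row
// hypothetical values for the app/server/file/... columns
val row = Row("myApp", "server01", "/data/input/part-00000", "...")
val df = spark.createDataFrame(spark.sparkContext.parallelize(Seq(row)), schema)
df.show()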
I mean that the kafka consumers running on the executors should not be
automatically committing, because the fact that a message was read by
the consumer has no bearing on whether it was actually successfully
processed after reading.
It sounds to me like you're confused about where code is running.
Hi Marco,
thanks a ton, I will surely use those alternatives.
Regards,
Gourav Sengupta
On Sun, Aug 6, 2017 at 3:45 PM, Marco Mistroni wrote:
> Sengupta
> further to this, if you try the following notebook in databricks cloud,
> it will read a .csv file, write to a parquet file and read it again (just
> to count the number of rows stored)
Thanks Cody for your response.
All I want to do is commit the offsets only if I am successfully able to
write to the Cassandra database.
The line //save the rdd to Cassandra database is
rdd.map { record => () }.saveToCassandra("keyspace1", "table1")
What do you mean by "Executors shouldn't be auto-committing"?
update - it seems 'spark-shell' does not support mode -> yarn-cluster (I
guess since it is an interactive shell).
The only modes supported are -> yarn-client & local.
Please let me know if my understanding is incorrect.
Thanks!
On Sun, Aug 6, 2017 at 10:07 AM, karan alang wrote:
> Hello all - i'd
Hello all - I had a basic question on the modes in which spark-shell can be
run.
When I run the following command, does Spark run in local mode, i.e.
outside of YARN and using the local cores (since the '--master' option is
missing)?
./bin/spark-shell --driver-memory 512m --executor-memory 512m
Simila
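For reference, when --master is omitted, spark-shell defaults to local mode
using all local cores; the effective master can be checked from inside the
shell:

scala> sc.master
res0: String = local[*]   // default when no --master flag is given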
If your complaint is about offsets being committed that you didn't
expect... auto commit being false on executors shouldn't have anything
to do with that. Executors shouldn't be auto-committing; that's why
it's being overridden.
What you've said and the code you posted isn't really enough to
explain what is happening.
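For context, a minimal sketch of the consumer config usually passed to
createDirectStream (the broker and group id are placeholders); the point is
that enable.auto.commit is kept false, and Spark overrides it to false on
the executors in any case, so commits only happen where commitAsync is
called:

import org.apache.kafka.common.serialization.StringDeserializer
val kafkaParams = Map[String, Object](
  "bootstrap.servers"  -> "broker1:9092",              // placeholder
  "key.deserializer"   -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id"           -> "example-group",             // placeholder
  "enable.auto.commit" -> (false: java.lang.Boolean)   // commits stay manual
)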
Sengupta
further to this, if you try the following notebook in databricks cloud, it
will read a .csv file, write to a parquet file and read it again (just to
count the number of rows stored).
Please note that the path to the csv file might differ for you.
So, what you will need to do is
1 - cre
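The sequence that notebook performs, as a minimal Scala sketch (the paths
are illustrative, and a SparkSession named spark is assumed):

val df = spark.read.option("header", "true").csv("/path/to/input.csv")
df.write.mode("overwrite").parquet("/path/to/output.parquet")
val n = spark.read.parquet("/path/to/output.parquet").count()
println(s"rows stored: $n")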