Re: Using SparkContext in Executors

2017-05-28 Thread ayan guha
Hi, you can modify your parse function to yield/emit the output record instead of inserting. That way, you can essentially call .toDF to convert the outcome to a DataFrame and then use the driver's Cassandra connection to save to Cassandra (data will still be in Executors, but now the connector itself will
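A minimal sketch of what ayan describes, assuming a hypothetical comma-delimited dump file; the keyspace, table, schema, and path are made-up placeholders, and the spark-cassandra-connector's DataFrame writer is assumed to be on the classpath:

```scala
// Sketch only: parse() yields records instead of inserting them itself.
// Keyspace, table, schema, and dump path are all made-up placeholders.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("parse-and-save").getOrCreate()
import spark.implicits._

case class Record(id: Long, payload: String)

def parse(line: String): Record = {
  val Array(id, payload) = line.split(",", 2)
  Record(id.toLong, payload)
}

val df = spark.sparkContext
  .textFile("/path/to/dump")   // read and parse run on the executors
  .map(parse)
  .toDF()                      // the DataFrame is defined driver-side

// The connector plans the write on the driver, but the rows never leave
// the executors; they are written out in a distributed fashion.
df.write
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "my_ks", "table" -> "my_table"))
  .save()
```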

Re: Spark sql 2.1.0 thrift jdbc server - create table xxx as select * from yyy sometimes get error

2017-05-28 Thread ????
Hi all, any ideas on this error stack trace? 17/05/29 08:44:53 ERROR thriftserver.SparkExecuteStatementOperation: Error executing query, currentState RUNNING, org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source

Re: Spark sql 2.1.0 thrift jdbc server - create table xxx as select * from yyy sometimes get error

2017-05-28 Thread ????
Hi all, after upgrading to Spark 2.1.1, I now get more error details. It seems to be an HDFS permission error. Any ideas why it works the first time, but the same statement then results in "unable to move source"? Error: org.apache.spark.sql.AnalysisException:

Re: Using SparkContext in Executors

2017-05-28 Thread Stephen Boesch
You would need to use *native* Cassandra APIs in each Executor - not org.apache.spark.sql.cassandra.CassandraSQLContext - including to create a separate Cassandra connection on each Executor. 2017-05-28 15:47 GMT-07:00 Abdulfattah Safa : > So I can't run SQL queries in
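A sketch of the per-executor approach Stephen describes, using the DataStax Java driver 3.x directly inside foreachPartition; the contact point, keyspace, table, and RDD element type are all assumptions:

```scala
// Sketch only: one native Cassandra session per partition, built on the
// executor itself so nothing non-serializable crosses the wire.
import com.datastax.driver.core.Cluster
import org.apache.spark.rdd.RDD

def saveToCassandra(rdd: RDD[(Long, String)]): Unit =
  rdd.foreachPartition { rows =>
    val cluster = Cluster.builder().addContactPoint("cassandra-host").build()
    val session = cluster.connect("my_ks")
    try {
      val stmt = session.prepare("INSERT INTO my_table (id, payload) VALUES (?, ?)")
      rows.foreach { case (id, payload) =>
        session.execute(stmt.bind(java.lang.Long.valueOf(id), payload))
      }
    } finally {
      session.close()
      cluster.close()
    }
  }
```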

Re: Using SparkContext in Executors

2017-05-28 Thread Abdulfattah Safa
So I can't run SQL queries in Executors? On Sun, May 28, 2017 at 11:00 PM Mark Hamstra wrote: > You can't do that. SparkContext and SparkSession can exist only on the > Driver. > > On Sun, May 28, 2017 at 6:56 AM, Abdulfattah Safa > wrote: > >>

Re: [Spark Streaming] DAG Output Processing mechanism

2017-05-28 Thread Nipun Arora
Apologies - resending, as the previous mail went out with some unnecessary copy-paste. I would like some clarification on the execution model for Spark Streaming. Broadly, I am trying to understand whether output operations in a DAG are only processed after all intermediate operations are finished for all

Re: Using SparkContext in Executors

2017-05-28 Thread Mark Hamstra
You can't do that. SparkContext and SparkSession can exist only on the Driver. On Sun, May 28, 2017 at 6:56 AM, Abdulfattah Safa wrote: > How can I use SparkContext (to create Spark Session or Cassandra Sessions) > in executors? > If I pass it as parameter to the foreach

[Spark Streaming] DAG Output Processing mechanism

2017-05-28 Thread Nipun Arora
I would like some clarification on the execution model for Spark Streaming. Broadly, I am trying to understand whether output operations in a DAG are only processed after all intermediate operations are finished for all parts of the DAG. Let me give an example: I have a
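For reference, the shape of pipeline the question is about might look like the sketch below (host, port, and the operations themselves are placeholders). By default Spark Streaming runs the output operations of a batch one at a time, in the order they are defined, so output op B here would not start until output op A finished.

```scala
// Placeholder pipeline illustrating the question: does an output operation
// wait for all intermediate transformations of the micro-batch to finish?
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("dag-output-question")
val ssc  = new StreamingContext(conf, Seconds(5))

val lines = ssc.socketTextStream("localhost", 9999)
val step1 = lines.map(_.toUpperCase)   // intermediate op 1
val step2 = step1.map(_.reverse)       // intermediate op 2

step1.foreachRDD(rdd => rdd.foreach(println)) // output op A
step2.foreachRDD(rdd => rdd.foreach(println)) // output op B

ssc.start()
ssc.awaitTermination()
```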

Using SparkContext in Executors

2017-05-28 Thread Abdulfattah Safa
How can I use SparkContext (to create a Spark Session or Cassandra Sessions) in executors? If I pass it as a parameter to foreach or foreachPartition, then it will have a null value. Shall I create a new SparkContext in each executor? Here is what I'm trying to do: Read a dump directory with
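A minimal reproduction of the problem being described, with a made-up dump path; the offending line is left commented out because it cannot work:

```scala
// SparkContext lives only on the driver: a closure that references it either
// fails with "Task not serializable" or sees it as null on the executor.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("repro").getOrCreate()
val sc = spark.sparkContext

sc.textFile("/path/to/dump").foreachPartition { lines =>
  // WRONG: sc is driver-only and cannot be used here.
  // sc.parallelize(lines.toSeq)
  lines.foreach(_ => ())  // per-partition work itself is fine
}
```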

Re: Spark sql 2.1.0 thrift jdbc server - create table xxx as select * from yyy sometimes get error

2017-05-28 Thread ????
Hi, here are the SQL statements I'm using. Running these 2 statements in beeline works just fine. But with Spring's org.springframework.jdbc.datasource.SimpleDriverDataSource, no matter whether using jdbcTemplate or datasource.getConnection and createStatement, after executing these 2 statements the second create
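For context, a sketch of the Spring JDBC path being described, pointed at the Thrift server through the Hive JDBC driver; the URL and the two statements are placeholders, not the poster's actual SQL:

```scala
// Sketch only: SimpleDriverDataSource + JdbcTemplate against HiveServer2.
import org.springframework.jdbc.core.JdbcTemplate
import org.springframework.jdbc.datasource.SimpleDriverDataSource

val ds = new SimpleDriverDataSource(
  new org.apache.hive.jdbc.HiveDriver(),   // the Thrift server speaks the HiveServer2 protocol
  "jdbc:hive2://thrift-host:10000/default")

val jdbc = new JdbcTemplate(ds)
jdbc.execute("create table yyy (id int)")
jdbc.execute("create table xxx as select * from yyy") // the statement that intermittently fails
```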

Spark sql 2.1.0 thrift jdbc server - create table xxx as select * from yyy sometimes get error

2017-05-28 Thread ????
Hey guys, I've posted a question here: https://stackoverflow.com/questions/44223024/spark-sql-2-1-0-create-table-xxx-as-select-from-yyy-sometimes-get-error The Thrift SQL server sometimes can't execute a "create table as select" statement until restart. When this happens, the spark job/stage gives

[WARN] org.apache.parquet.hadoop.ParquetRecordReader: Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl

2017-05-28 Thread Mendelson, Assaf
Hi, I am getting the following warning: [WARN] org.apache.parquet.hadoop.ParquetRecordReader: Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl This seems to occur every time I try to read from
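For reference, an ordinary Parquet read like the sketch below (the path is a placeholder) is enough to surface this warning; the counter it mentions is a Parquet-internal metric, and the read itself still succeeds:

```scala
// Placeholder read that triggers the WARN; the read is functionally unaffected.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("parquet-read").getOrCreate()
val df = spark.read.parquet("/path/to/data.parquet")
df.show()
```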