RE: Schema Evolution Parquet vs Avro

2017-05-29 Thread Venkata ramana gollamudi
Hi Sam, You can consider checking the CarbonData format (https://github.com/apache/carbondata). It supports column removal and datatype change of an existing column. For column rename you can raise an issue to request support. Regards, Ramana From: Joel D [games2013@gmail.com]
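A rough sketch of the two supported operations Ramana mentions, written as CarbonData DDL issued through a CarbonData-enabled Spark session. Table and column names here (`sales`, `legacy_flag`, `quantity`) are made up for illustration, and `carbon` is an assumed session handle; consult the CarbonData DDL docs for the exact syntax of your version.

```scala
// Hypothetical example: schema evolution on a CarbonData table.
// `carbon` is assumed to be a CarbonData-enabled SparkSession.

// Column removal:
carbon.sql("ALTER TABLE sales DROP COLUMNS (legacy_flag)")

// Datatype change of an existing column (e.g. widening INT to BIGINT):
carbon.sql("ALTER TABLE sales CHANGE quantity quantity BIGINT")
```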

Re: [Spark Streaming] DAG Output Processing mechanism

2017-05-29 Thread Nipun Arora
Sending out the message again. Hopefully someone can clarify :) I would like some clarification on the execution model for Spark Streaming. Broadly, I am trying to understand if output operations in a DAG are only processed after all intermediate operations are finished for all parts of the
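For readers following the question: the structure being asked about can be sketched as below. Transformations on a DStream are lazy; within each batch interval, the output operation is what triggers the job, so it runs against that batch's RDD only after its upstream transformations for that batch have been computed. The host/port source and the names here are placeholders, not from the original post.

```scala
// Sketch of a streaming DAG ending in one output operation.
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("dag-demo").setMaster("local[2]")
val ssc  = new StreamingContext(conf, Seconds(5))

// Intermediate (lazy) transformations: nothing executes yet.
val lines  = ssc.socketTextStream("localhost", 9999)  // placeholder source
val counts = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)

// Output operation: scheduled once per batch; it materializes the
// lineage above for that batch's data before its body runs.
counts.foreachRDD { rdd => rdd.take(10).foreach(println) }

ssc.start()
ssc.awaitTermination()
```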

Schema Evolution Parquet vs Avro

2017-05-29 Thread Joel D
Hi, We are trying to come up with the best storage format for handling schema changes in ingested data. We noticed that both Avro and Parquet allow one to select by column name instead of by the index/position of the data. However, we are inclined towards Parquet for better read performance
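One concrete way Parquet handles the schema changes Joel describes: Spark can merge the schemas of Parquet files written at different times, resolving fields by name. A minimal sketch, assuming a `SparkSession` named `spark` and hypothetical paths where old files have columns `(id, name)` and newer ones add `email`:

```scala
// Hypothetical paths; schema merging reconciles old and new Parquet files.
val merged = spark.read
  .option("mergeSchema", "true")   // off by default; merges file schemas by name
  .parquet("/data/events")         // directory mixing (id, name) and (id, name, email)

merged.printSchema()               // union of the columns across all files
```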

Re: Dynamically working out upperbound in JDBC connection to Oracle DB

2017-05-29 Thread Don Drake
Try passing maxID.toString, I think it wants the number as a string. On Mon, May 29, 2017 at 3:12 PM, Mich Talebzadeh wrote: > thanks Gents but no luck! > > scala> val s = HiveContext.read.format("jdbc").options( > | Map("url" -> _ORACLEserver, > | "dbtable"
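Don's suggestion is right about the type: `DataFrameReader.options` takes a `Map[String, String]`, so a numeric bound computed at runtime has to be rendered as a string. A self-contained sketch of just the options map (the `maxID` value here is a stand-in for whatever was computed from Oracle):

```scala
// The options map is Map[String, String], so the bound must be stringified.
val maxID: Long = 100000L

val jdbcOpts: Map[String, String] = Map(
  "partitionColumn" -> "ID",
  "lowerBound"      -> "1",
  "upperBound"      -> maxID.toString, // not maxID itself, and not the literal "maxID"
  "numPartitions"   -> "10"
)

println(jdbcOpts("upperBound"))
```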

Re: Dynamically working out upperbound in JDBC connection to Oracle DB

2017-05-29 Thread Mich Talebzadeh
thanks Gents but no luck! scala> val s = HiveContext.read.format("jdbc").options( | Map("url" -> _ORACLEserver, | "dbtable" -> "(SELECT ID, CLUSTERED, SCATTERED, RANDOMISED, RANDOM_STRING, SMALL_VC, PADDING FROM scratchpad.dummy)", | "partitionColumn" -> "ID", | "lowerBound"

Re: Dynamically working out upperbound in JDBC connection to Oracle DB

2017-05-29 Thread ayan guha
You are using maxId as a string literal. Try removing the quotes around maxId On Tue, 30 May 2017 at 2:56 am, Jörn Franke wrote: > I think you need to remove the hyphen around maxid > > On 29. May 2017, at 18:11, Mich Talebzadeh > wrote: > > Hi,

Re: Dynamically working out upperbound in JDBC connection to Oracle DB

2017-05-29 Thread Jörn Franke
I think you need to remove the hyphen around maxid > On 29. May 2017, at 18:11, Mich Talebzadeh wrote: > > Hi, > > This JDBC connection works with Oracle table with primary key ID > > val s = HiveContext.read.format("jdbc").options( > Map("url" -> _ORACLEserver, >

Dynamically working out upperbound in JDBC connection to Oracle DB

2017-05-29 Thread Mich Talebzadeh
Hi, This JDBC connection works with Oracle table with primary key ID val s = HiveContext.read.format("jdbc").options( Map("url" -> _ORACLEserver, "dbtable" -> "(SELECT ID, CLUSTERED, SCATTERED, RANDOMISED, RANDOM_STRING, SMALL_VC, PADDING FROM scratchpad.dummy)", "partitionColumn" -> "ID",
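A sketch of the "dynamic upper bound" pattern the subject line asks about: first fetch `MAX(ID)` with a one-row JDBC query, then feed it, as a string, into the partitioned read. Connection details, credentials, and the `getDecimal` accessor (Oracle `NUMBER` typically arrives as a decimal) are assumptions for illustration, not taken from the original post.

```scala
// Step 1: one-row query for the maximum partition-column value.
val maxID = HiveContext.read.format("jdbc")
  .options(Map(
    "url"     -> _ORACLEserver,
    "dbtable" -> "(SELECT MAX(ID) AS MAXID FROM scratchpad.dummy)",
    "user"    -> "scratchpad",         // placeholder credentials
    "password" -> "***"))
  .load()
  .collect()(0).getDecimal(0).longValue  // Oracle NUMBER read back as Decimal

// Step 2: partitioned read, passing the computed bound as a String.
val s = HiveContext.read.format("jdbc")
  .options(Map(
    "url"             -> _ORACLEserver,
    "dbtable"         -> "(SELECT ID, CLUSTERED, SCATTERED, RANDOMISED, RANDOM_STRING, SMALL_VC, PADDING FROM scratchpad.dummy)",
    "partitionColumn" -> "ID",
    "lowerBound"      -> "1",
    "upperBound"      -> maxID.toString,  // options values must be Strings
    "numPartitions"   -> "10",
    "user"            -> "scratchpad",
    "password"        -> "***"))
  .load()
```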

RE: Disable queuing of spark job on Mesos cluster if sufficient resources are not found

2017-05-29 Thread Mevada, Vatsal
Is there any configurable timeout which controls queuing of the driver in Mesos cluster mode, or will the driver remain in the queue indefinitely until it finds resources on the cluster? From: Michael Gummelt [mailto:mgumm...@mesosphere.io] Sent: Friday, May 26, 2017 11:33 PM To: Mevada, Vatsal