subject:"Re\: Spark on Kudu"

Re: spark on kudu performance!

2018-07-05 Thread Todd Lipcon

On Mon, Jun 11, 2018 at 5:52 AM, fengba...@uce.cn wrote: > Hi: > > I use kudu official website development documents, use > spark analysis kudu data (kudu's version is 1.6.0): > > the official code is: > *val df = sqlContext.read.options(Map("kudu.master" -> > "kudu.master:7051","kudu.table"

Re: Spark Streaming + Kudu

2018-03-06 Thread Ravi Kanth

Mike- I actually got hold of the PIDs for the Spark executors but am facing issues running jstack. There are some VM exceptions. I will figure it out and will attach the jstack. Thanks for your patience. On 6 March 2018 at 20:42, Mike Percy wrote: > Hmm, could you try in

Re: Spark Streaming + Kudu

2018-03-06 Thread Mike Percy

Hmm, could you try in spark local mode? i.e. https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-local.html Mike On Tue, Mar 6, 2018 at 7:14 PM, Ravi Kanth wrote: > Mike, > > Can you clarify a bit on grabbing the jstack for the process? I launched

Re: Spark Streaming + Kudu

2018-03-06 Thread Ravi Kanth

Mike, Can you clarify a bit on grabbing the jstack for the process? I launched my Spark application and tried to get the pid, with which I thought I could grab a jstack trace during the hang. Unfortunately, I am not able to figure out how to get the pid for the Spark application. Thanks, Ravi On 6 March 2018 at

Re: Spark Streaming + Kudu

2018-03-06 Thread Ravi Kanth

Yes, I have debugged to find the root cause. Every logger before "table = client.openTable(tableName);" is executing fine and exactly at the point of opening the table, it is throwing the below exception and nothing is being executed after that. Still the Spark batches are being processed and at

Re: Spark Streaming + Kudu

2018-03-05 Thread Mike Percy

Have you considered checking your session error count or pending errors in your while loop every so often? Can you identify where your code is hanging when the connection is lost (what line)? Mike On Mon, Mar 5, 2018 at 9:08 PM, Ravi Kanth wrote: > In addition to my

Re: Spark Streaming + Kudu

2018-03-05 Thread Ravi Kanth

In addition to my previous comment, I raised a support ticket for this issue with Cloudera and one of the support staff mentioned the following: *"Thank you for clarifying, The exceptions are logged but not re-thrown to an upper layer, so that explains why the Spark application is not aware of the

Re: Spark Streaming + Kudu

2018-03-05 Thread Ravi Kanth

Mike, Thanks for the information. But once the connection to any of the Kudu servers is lost, there is no way I can control the KuduSession object, and the same goes for getPendingErrors(). The KuduClient in this case becomes a zombie and never returns until the connection is

Re: Spark Streaming + Kudu

2018-03-05 Thread Mike Percy

Hi Ravi, it would be helpful if you could attach what you are getting back from getPendingErrors() -- perhaps from dumping RowError.toString() from items in the returned array -- and indicate what you were hoping to get back. Note that a RowError can also return to you the Operation

Re: Spark Streaming + Kudu

2018-03-05 Thread Ravi Kanth

Hi Mike, Thanks for the reply. Yes, I am using AUTO_FLUSH_BACKGROUND. So, I am trying to use the Kudu Client API to perform UPSERT into Kudu and I integrated this with Spark. I am trying to test a case wherein any of the Kudu servers fails. So, in this case, if there is any problem in writing,

Re: Spark Streaming + Kudu

2018-03-05 Thread Mike Percy

Hi Ravi, are you using AUTO_FLUSH_BACKGROUND? You mention that you are trying to use getPendingErrors()

Re: Spark Streaming + Kudu

2018-02-26 Thread Ravi Kanth

Thanks, Clifford. We are running Kudu version 1.4. To date we haven't seen any issues in production and we are not losing tablet servers. But, as part of testing I have to generate a few unforeseen cases to analyse the application performance. One among them is bringing down the tablet server or

Re: Spark on Kudu Roadmap

2017-04-09 Thread Benjamin Kim

Hi Mike, Thanks for the link. I guess further, deeper Spark integration is slowly coming. But when, we will have to wait and see. Cheers, Ben > On Mar 27, 2017, at 12:25 PM, Mike Percy wrote: > > Hi Ben, > I don't really know so I'll let someone else more familiar with

Re: Spark on Kudu Roadmap

2017-03-27 Thread Benjamin Kim

Hi Mike, I believe what we are looking for is this below. It is an often-requested use case. Does anyone know if the Spark package will ever allow for creating tables in Spark SQL? Such as: CREATE EXTERNAL TABLE USING org.apache.kudu.spark.kudu OPTIONS (Map("kudu.master" -> “",

Re: Spark on Kudu Roadmap

2017-03-27 Thread Mike Percy

Hi Ben, Is there anything in particular you are looking for? Thanks, Mike On Mon, Mar 27, 2017 at 9:48 AM, Benjamin Kim wrote: > Hi, > > Are there any plans for deeper integration with Spark especially Spark > SQL? Is there a roadmap to look at, so I can know what to expect

Re: Spark on Kudu

2016-10-10 Thread Mark Hamstra

I realize that the Spark on Kudu work to date has been based on Spark 1.6, where your statement about Spark SQL relying on Hive is true. In Spark 2.0, however, that dependency no longer exists since Spark SQL essentially copied over the parts of Hive that were needed into Spark itself, and has

Re: Spark on Kudu

2016-09-20 Thread Benjamin Kim

Thanks! > On Sep 20, 2016, at 3:02 PM, Jordan Birdsell > wrote: > > http://kudu.apache.org/docs/developing.html#_kudu_integration_with_spark > > > On Tue, Sep 20, 2016 at 5:00 PM Benjamin

17 matches
