Re: Kubernetes security context when submitting job through k8s servers

2018-07-09 Thread trung kien
tom SecurityContext > of the driver/executor pods. This will be supported by the solution to > https://issues.apache.org/jira/browse/SPARK-24434. > > On Mon, Jul 9, 2018 at 2:06 PM trung kien wrote: > >> Dear all, >> >> Is there any way to includes security context (

Kubernetes security context when submitting job through k8s servers

2018-07-09 Thread trung kien
Dear all, Is there any way to include a security context ( https://kubernetes.io/docs/tasks/configure-pod-container/security-context/) when submitting a job through the k8s servers? I'm trying to run my first Spark jobs on Kubernetes through spark-submit: bin/spark-submit --master k8s://https://API_SERVERS
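As a concrete illustration of the pod-template route pointed to in the reply above: once the SPARK-24434 pod-template work is available, a securityContext can be supplied through a template file. This is a hypothetical sketch (the file name and values are invented; field names follow the Kubernetes securityContext documentation linked in the question):

```yaml
# driver-pod-template.yaml -- illustrative only; assumes the pod-template
# feature from SPARK-24434 is available in the Spark release being used.
apiVersion: v1
kind: Pod
spec:
  securityContext:        # pod-level settings
    runAsUser: 1000
    fsGroup: 2000
  containers:
    - name: spark-kubernetes-driver
      securityContext:    # container-level settings
        allowPrivilegeEscalation: false
```

Such a template would then be referenced at submit time with something like `--conf spark.kubernetes.driver.podTemplateFile=driver-pod-template.yaml` (and the analogous executor setting), rather than being expressible through `spark-submit` flags directly.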

Re: Spark Streaming - Kafka Direct Approach: re-compute from specific time

2016-05-25 Thread trung kien
Ah right, I see. Thank you very much. On May 25, 2016 11:11 AM, "Cody Koeninger" <c...@koeninger.org> wrote: > There's an overloaded createDirectStream method that takes a map from > topicpartition to offset for the starting point of the stream. > > On Wed, May 25,

Re: Spark Streaming - Kafka Direct Approach: re-compute from specific time

2016-05-25 Thread trung kien
exing, there's a kafka > improvement proposal for it but it has gotten pushed back to at least > 0.10.1 > > If you want to do this kind of thing, you will need to maintain your > own index from time to offset. > > On Wed, May 25, 2016 at 8:15 AM, trung kien <kient...@gmail.co
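The "maintain your own index from time to offset" suggestion above can be sketched in plain Python. This is illustrative only, not a Spark or Kafka API; the class name and methods are invented for the example. The idea is to record (timestamp, offset) pairs as batches complete, then look up the offset nearest a requested start time and feed it to the overloaded `createDirectStream` mentioned in the other reply:

```python
import bisect

class OffsetTimeIndex:
    """Hypothetical helper: records (timestamp, offset) pairs for one
    topic-partition so a stream can later be restarted from the offset
    at or before a given time (e.g. beginning of day)."""

    def __init__(self):
        self._times = []    # commit timestamps, appended in increasing order
        self._offsets = []  # Kafka offset recorded at each timestamp

    def record(self, ts, offset):
        # Called once per completed batch; assumes monotonically increasing ts.
        self._times.append(ts)
        self._offsets.append(offset)

    def offset_at(self, ts):
        # Largest recorded offset at or before ts; 0 if nothing recorded yet.
        i = bisect.bisect_right(self._times, ts)
        return self._offsets[i - 1] if i else 0
```

A re-computation from the beginning of the day would then call `offset_at(start_of_day)` per partition and pass the resulting map as the starting offsets of the new direct stream.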

Spark Streaming - Kafka Direct Approach: re-compute from specific time

2016-05-25 Thread trung kien
Hi all, Is there any way to re-compute using the Spark Streaming - Kafka Direct Approach from a specific time? In some cases, I want to re-compute again from a specific time (e.g. the beginning of the day). Is that possible? -- Thanks Kien

Re: Correct way to use spark streaming with apache zeppelin

2016-03-13 Thread trung kien
he existing value and >>> trigger your code to push out an updated value to any clients via the >>> websocket. You could use something like a Redis pub/sub channel to trigger >>> the web app to notify clients of an update. >>> >>> There are about
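The Redis pub/sub pattern described in the reply above can be illustrated with a minimal in-memory stand-in (plain Python, no Redis client; the class is hypothetical and only shows the publish/notify flow, where the streaming job publishes an updated aggregate and each subscribed websocket handler pushes it to its clients):

```python
class PubSub:
    """Minimal in-memory stand-in for a Redis pub/sub channel,
    for illustration only."""

    def __init__(self):
        self._subs = {}  # channel name -> list of handler callables

    def subscribe(self, channel, handler):
        # A websocket handler registers interest in a channel.
        self._subs.setdefault(channel, []).append(handler)

    def publish(self, channel, message):
        # The streaming job publishes an updated value; every
        # subscribed handler is invoked with it.
        for handler in self._subs.get(channel, []):
            handler(message)
```

With real Redis the same shape applies: the Spark streaming job `PUBLISH`es the updated aggregate, and the web app's subscriber loop forwards it to connected clients.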

Re: Correct way to use spark streaming with apache zeppelin

2016-03-12 Thread trung kien
ytics" -- do you mean build a report or dashboard that automatically updates as new data comes in? -- Chris Miller On Sat, Mar 12, 2016 at 3:13 PM, trung kien <kient...@gmail.com> wrote: > Hi all, > > I've just viewed some Zeppelin's videos. The integration between > Z

Correct way to use spark streaming with apache zeppelin

2016-03-11 Thread trung kien
Hi all, I've just viewed some Zeppelin's videos. The integration between Zeppelin and Spark is really amazing and I want to use it for my application. In my app, I will have a Spark streaming app to do some basic real-time aggregation (intermediate data). Then I want to use Zeppelin to do

Re: RDD partition after calling mapToPair

2015-11-24 Thread trung kien
, 2015 12:26 AM, "Cody Koeninger" <c...@koeninger.org> wrote: >> >>> Spark direct stream doesn't have a default partitioner. >>> >>> If you know that you want to do an operation on keys that are already >>> partitioned by kafka, just use mapPartitions or

Re: Spark Streaming data checkpoint performance

2015-11-07 Thread trung kien
Hmm, it seems that's just a trick. Using this method, it's very hard to recover from failure, since we don't know which batches have been done. I really want to maintain the whole running stats in memory to achieve full fault-tolerance. I just wonder if the performance of data checkpointing is that