I think you can build your own Accumulo credential provider, similar to the
HadoopDelegationTokenProvider in Spark. Spark already provides an
interface, "ServiceCredentialProvider", for users to plug in a customized
credential provider.
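Roughly, such a provider could look like the sketch below. This assumes the Spark 2.x YARN trait org.apache.spark.deploy.yarn.security.ServiceCredentialProvider; the Accumulo-specific token acquisition is only indicated by a comment, and the class name is mine:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.Credentials
import org.apache.spark.SparkConf
import org.apache.spark.deploy.yarn.security.ServiceCredentialProvider

// Sketch only: the Accumulo token logic itself is not shown.
class AccumuloCredentialProvider extends ServiceCredentialProvider {

  // Service name, also used in the per-service enable/disable config key.
  override def serviceName: String = "accumulo"

  override def obtainCredentials(
      hadoopConf: Configuration,
      sparkConf: SparkConf,
      creds: Credentials): Option[Long] = {
    // Hypothetical: use the Accumulo client API to obtain a delegation token
    // and add it to `creds`, then return Some(nextRenewalTime) if the token
    // needs periodic renewal.
    None
  }
}

The provider is picked up through Java's ServiceLoader, i.e. by listing the class name in a META-INF/services/org.apache.spark.deploy.yarn.security.ServiceCredentialProvider file on the application classpath.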
Thanks
Jerry
2018-03-23 14:29 GMT+08:00 Jorge Machado :
Hi Guys,
I’m in the middle of writing a Spark DataSource connector for Apache Spark to
connect to Accumulo tablets. Because we have Kerberos it gets a little tricky,
since Spark only handles the delegation tokens for HBase, Hive and HDFS.
Would a PR for an implementation of HadoopDelegationTokenProvider be welcome?
Use a streaming query listener that tracks repeated progress events for the
same batch id. If a given amount of time has elapsed with only repeated
progress events for the same batch id, the source is not providing new offsets
and the stream execution is not scheduling new micro-batches. See also:
spark.
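A minimal sketch of such a listener (the class name and timeout handling are mine, not from the thread). It stops a query once progress reports have carried the same batchId for longer than a configured idle timeout:

import java.util.UUID
import java.util.concurrent.ConcurrentHashMap

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener._

// Stops a query when the same batchId keeps being reported, i.e. no new
// micro-batches are being scheduled, for longer than idleTimeoutMs.
class IdleStreamStopper(spark: SparkSession, idleTimeoutMs: Long)
    extends StreamingQueryListener {

  // queryId -> (last seen batchId, time when that batchId was first seen)
  private val lastSeen = new ConcurrentHashMap[UUID, (Long, Long)]()

  override def onQueryStarted(event: QueryStartedEvent): Unit = ()

  override def onQueryTerminated(event: QueryTerminatedEvent): Unit = {
    lastSeen.remove(event.id)
  }

  override def onQueryProgress(event: QueryProgressEvent): Unit = {
    val progress = event.progress
    val now = System.currentTimeMillis()
    val previous = lastSeen.get(progress.id)

    if (previous == null || previous._1 != progress.batchId) {
      // A new batch was scheduled, so the source is still providing offsets.
      lastSeen.put(progress.id, (progress.batchId, now))
    } else if (now - previous._2 > idleTimeoutMs) {
      // Same batchId for too long: stop the query from a separate thread so
      // the listener bus is not blocked.
      new Thread(new Runnable {
        override def run(): Unit =
          Option(spark.streams.get(progress.id)).foreach(_.stop())
      }).start()
    }
  }
}

// Usage, e.g. for a 2-minute idle timeout:
//   spark.streams.addListener(new IdleStreamStopper(spark, 2 * 60 * 1000))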
Hi,
What is the way to stop a Spark Streaming job if there is no data inflow
for an arbitrary amount of time (e.g. 2 mins)?
Thanks,
Aakash.
Hi Shmuel,
Did you compile the code against the right branch for Spark 1.6?
I tested it and it looks to be working, and I'm now testing the branch more
widely. Please use the branch for Spark 1.6.
On Fri, Mar 23, 2018 at 12:43 AM, Shmuel Blitz
wrote:
> Hi Rohit,
>
> Thanks for sharing this great
Hi:
I am working on a realtime application using Spark Structured Streaming (v
2.2.1). The application reads data from Kafka and, if there is a failure, I
would like to ignore the checkpoint. Is there any configuration to just read
from the latest Kafka offset after a failure and ignore any offset checkpoints?
Structured Streaming AUTOMATICALLY saves the offsets in a checkpoint
directory that you provide. And when you start the query again with the
same directory it will just pick up where it left off.
https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#recovering-from-failur
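For reference, a minimal sketch of where the checkpoint comes in (broker, topic and paths are illustrative): the offsets live under checkpointLocation, and startingOffsets only matters the first time a query starts against a fresh checkpoint directory.

import org.apache.spark.sql.SparkSession

object KafkaCheckpointExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("kafka-checkpoint-example").getOrCreate()

    val input = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092")
      .option("subscribe", "events")
      // Only used when there is no checkpoint yet; afterwards the query
      // resumes from the offsets recorded in the checkpoint directory.
      .option("startingOffsets", "latest")
      .load()

    input.writeStream
      .format("parquet")
      .option("path", "/data/events-out")
      .option("checkpointLocation", "/checkpoints/events-query") // offsets and progress live here
      .start()
      .awaitTermination()
  }
}

Pointing the restarted query at a new (or emptied) checkpoint directory is effectively how you ignore the previously saved offsets.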
Hi:
I am working with Spark (2.2.1) and Kafka (0.10) on AWS EMR, and for the last
few days, after running the application for 30-60 minutes, I get the exception
from the Kafka consumer that is included below.
The structured streaming application is processing 1 minute's worth of data from
the Kafka topic. So I've tried
Yes indeed, we don't directly support schema migration of state as of now.
However, depending on which stateful operator you are using, you can work
around it. For example, if you are using mapGroupsWithState /
flatMapGroupsWithState, you can explicitly convert your state to
Avro-encoded bytes and store those bytes as the state instead.
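As an illustration of that workaround (class, field and helper names are mine; fixed-width longs stand in for a real Avro encoder just to keep the sketch self-contained), the state is declared as Array[Byte] so Spark only ever stores opaque bytes whose layout you control:

import java.nio.ByteBuffer

import org.apache.spark.sql.{Dataset, Encoder, Encoders}
import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout}

object ByteEncodedStateExample {
  case class Event(userId: String, value: Long)
  case class SessionState(count: Long, total: Long)

  // Stand-in (de)serializers: in practice these would use Avro with a schema
  // that you can evolve independently of Spark's own encoders.
  def encodeState(s: SessionState): Array[Byte] =
    ByteBuffer.allocate(16).putLong(s.count).putLong(s.total).array()

  def decodeState(bytes: Array[Byte]): SessionState = {
    val buf = ByteBuffer.wrap(bytes)
    SessionState(buf.getLong(), buf.getLong())
  }

  def aggregate(events: Dataset[Event]): Dataset[(String, Long)] = {
    implicit val keyEnc: Encoder[String] = Encoders.STRING
    implicit val stateEnc: Encoder[Array[Byte]] = Encoders.BINARY
    implicit val outEnc: Encoder[(String, Long)] =
      Encoders.tuple(Encoders.STRING, Encoders.scalaLong)

    events
      .groupByKey(_.userId)
      .mapGroupsWithState[Array[Byte], (String, Long)](GroupStateTimeout.NoTimeout) {
        (userId: String, rows: Iterator[Event], state: GroupState[Array[Byte]]) =>
          val prev = if (state.exists) decodeState(state.get) else SessionState(0L, 0L)
          val next = rows.foldLeft(prev)((s, e) => SessionState(s.count + 1, s.total + e.value))
          state.update(encodeState(next)) // only opaque bytes go into the state store
          (userId, next.total)
      }
  }
}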
I am trying to research a custom Aggregator implementation, and I am following
the example in the Spark sample code here:
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/sql/UserDefinedTypedAggregation.scala
But I cannot use it in the agg function, and g
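For what it's worth, the usual way to plug a typed Aggregator into agg on a Dataset is to turn it into a column with toColumn first; a minimal sketch along the lines of that example (class and column names are illustrative):

import org.apache.spark.sql.{Encoder, Encoders, SparkSession}
import org.apache.spark.sql.expressions.Aggregator

object TypedAggExample {
  case class Sale(shop: String, amount: Double)

  // Simple typed aggregator: sums the amount per group.
  object SumAmount extends Aggregator[Sale, Double, Double] {
    def zero: Double = 0.0
    def reduce(acc: Double, s: Sale): Double = acc + s.amount
    def merge(a: Double, b: Double): Double = a + b
    def finish(acc: Double): Double = acc
    def bufferEncoder: Encoder[Double] = Encoders.scalaDouble
    def outputEncoder: Encoder[Double] = Encoders.scalaDouble
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("typed-agg").master("local[*]").getOrCreate()
    import spark.implicits._

    val sales = Seq(Sale("a", 1.0), Sale("a", 2.5), Sale("b", 4.0)).toDS()

    // The Aggregator has to be converted to a TypedColumn before agg() accepts it.
    sales.groupByKey(_.shop)
      .agg(SumAmount.toColumn.name("total"))
      .show()
  }
}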
Hi Cody,
I am following this to implement exactly-once semantics and to store the
offsets in a database. The question I have is how to use Hive instead of
traditional datastores: a write to Hive will be successful even when
there is an issue with saving the offsets into the DB. Could you please correct
me if I'm wrong?
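For reference, a bare-bones sketch of the pattern in question (table names, the Row type and the JDBC plumbing are hypothetical): results and offsets commit in the same transaction, which is exactly the atomicity that is lost when the results go to a non-transactional sink such as a plain Hive table.

import java.sql.{Connection, DriverManager}

object TransactionalOffsetsSketch {
  case class Row(key: String, value: String)

  def saveBatch(jdbcUrl: String, rows: Seq[Row],
                topic: String, partition: Int, untilOffset: Long): Unit = {
    val conn: Connection = DriverManager.getConnection(jdbcUrl)
    conn.setAutoCommit(false)
    try {
      // 1. Write the batch results.
      val insert = conn.prepareStatement("INSERT INTO results(key, value) VALUES (?, ?)")
      rows.foreach { r =>
        insert.setString(1, r.key)
        insert.setString(2, r.value)
        insert.addBatch()
      }
      insert.executeBatch()

      // 2. Record how far we have read, in the SAME transaction.
      val offsets = conn.prepareStatement(
        "UPDATE kafka_offsets SET until_offset = ? WHERE topic = ? AND kafka_partition = ?")
      offsets.setLong(1, untilOffset)
      offsets.setString(2, topic)
      offsets.setInt(3, partition)
      offsets.executeUpdate()

      conn.commit() // results and offsets become visible together, or not at all
    } catch {
      case e: Exception =>
        conn.rollback()
        throw e
    } finally {
      conn.close()
    }
  }
}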
Hi Rohit,
Thanks for sharing this great tool.
I tried running a spark job with the tool, but it failed with an
IncompatibleClassChangeError exception.
I have opened an issue on GitHub
(https://github.com/qubole/sparklens/issues/1).
Shmuel
On Thu, Mar 22, 2018 at 5:05 PM, Shmuel Blitz
wrote:
Thanks.
We will give this a try and report back.
Shmuel
On Thu, Mar 22, 2018 at 4:22 PM, Rohit Karlupia wrote:
> Thanks everyone!
> Please share how it works and how it doesn't. Both help.
>
> Fawaze, just made few changes to make this work with spark 1.6. Can you
> please try building from br
Thanks all!
On Thu, Mar 22, 2018 at 2:08 AM, Jorge Machado wrote:
> DataFrames are not mutable.
>
> Jorge Machado
>
>
> On 22 Mar 2018, at 10:07, Aakash Basu wrote:
>
> Hey,
>
> I faced the same issue a couple of days back, kindly go through the mail
> chain with "*Multiple Kafka Spark Streamin
Thanks everyone!
Please share how it works and how it doesn't. Both help.
Fawaze, just made few changes to make this work with spark 1.6. Can you
please try building from branch *spark_1.6*
thanks,
rohitk
On Thu, Mar 22, 2018 at 10:18 AM, Fawze Abujaber wrote:
> It's super amazing i see
The SparkContext runs in the driver, whereas the function inside foreach runs
in the executors. You can pass the parameter into the function so that it is
available in the executors.
On Thu, 22 Mar 2018 at 8:18 pm, Kamalanathan Venkatesan <
kamalanatha...@in.ey.com> wrote:
> Hello All,
>
>
>
> I have custom parameter say for examp
Hello All,
I have a custom parameter, say for example a file name, added to the conf of
the Spark context, e.g. SparkConf.set(INPUT_FILE_NAME, fileName).
I need this value inside a foreach performed on an RDD, but when I access the
Spark context inside foreach, I receive a "spark context is null" exception!
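A minimal sketch of the workaround described in the reply above (the conf key and file name are illustrative): read the value from the conf on the driver and let the foreach closure capture the plain string, so the executors never touch the SparkContext.

import org.apache.spark.{SparkConf, SparkContext}

object ForeachConfExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("foreach-conf-example")
      .set("spark.myapp.inputFileName", "data-2018-03-22.csv")
    val sc = new SparkContext(conf)

    // Read the parameter on the driver...
    val inputFileName = sc.getConf.get("spark.myapp.inputFileName")

    // ...and let the closure capture the plain String, not the SparkContext.
    sc.parallelize(1 to 10).foreach { n =>
      println(s"processing record $n from $inputFileName")
    }

    sc.stop()
  }
}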
DataFrames are not mutable.
Jorge Machado
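To illustrate the point (column names are arbitrary): transformations return a new DataFrame and leave the original untouched, so you keep the result of each transformation rather than trying to modify a DataFrame in place.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.lit

object ImmutableDataFrameExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("immutable-df").master("local[*]").getOrCreate()
    import spark.implicits._

    val df = Seq(("a", 1), ("b", 2)).toDF("key", "value")
    val df2 = df.withColumn("flag", lit(true)) // a NEW DataFrame with the extra column

    df.printSchema()  // still only key and value
    df2.printSchema() // key, value, flag
  }
}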
> On 22 Mar 2018, at 10:07, Aakash Basu wrote:
>
> Hey,
>
> I faced the same issue a couple of days back, kindly go through the mail
> chain with "Multiple Kafka Spark Streaming Dataframe Join query" as subject,
> TD and Chris has cleared my doubts
Hey,
I faced the same issue a couple of days back; kindly go through the mail
chain with "Multiple Kafka Spark Streaming Dataframe Join query" as the
subject. TD and Chris have cleared my doubts, and it would help you too.
Thanks,
Aakash.
On Thu, Mar 22, 2018 at 7:50 AM, kant kodali wrote:
> Hi All,
Hey Jorge,
Thanks for responding.
Can you elaborate on the user permission part? HDFS or local?
As of now, the HDFS path
hdfs://n2pl-pa-hdn220.xxx.xxx:8020/user/yarn/.sparkStaging/application_1521457397747_0013/__spark_libs__8247917347016008883.zip
already has complete access for the yarn user.
Hi All,
Druid uses Hadoop MapReduce to ingest batch data, but I am trying Spark for
ingesting data into Druid, taking reference from
https://github.com/metamx/druid-spark-batch
But we are stuck at the following error.
Application Log:
2018-03-20T07:54:28,782 INFO [task-runner-0-priority-0] org.apach
Seems to me like a permissions problem! Can you check your user / folder
permissions?
Jorge Machado
> On 22 Mar 2018, at 08:21, nayan sharma wrote:
>
> Hi All,
> As druid uses Hadoop MapReduce to ingest batch data but I am trying spark for
> ingesting data into druid taking reference from