Thanks Nicholas, but the problem for us is that we want to use the NLTK
Python library, since our data scientists train with it. Rewriting the
inference logic with some other library would be time consuming, and in
some cases it may not even work, because some of the functions we rely on
are unavailable elsewhere.
nvm, I see it. It's http://localhost:8998
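In case it helps anyone else, here is a minimal sketch of pointing the
programmatic Livy client at that address (this assumes the org.apache.livy
client artifact is on the classpath; the actual job submission is left out):

    import java.net.URI
    import org.apache.livy.LivyClientBuilder

    // Livy's REST server listens on port 8998 by default, so with both
    // Spark and Livy running locally the URL is just localhost:8998.
    val livyUrl = "http://localhost:8998"
    val client = new LivyClientBuilder()
      .setURI(new URI(livyUrl))
      .build()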
On Fri, Dec 1, 2017 at 3:28 PM, kant kodali wrote:
> Hi All,
>
> I am running both spark and livy locally so imagine everything on a local
> machine.
> What should my livyUrl be set to? I don't see that in the example.
>
> Thanks!
>
Hello Burak,
Sorry for the delayed answer; you were right.
1) I changed the sql-kafka connector version, and that fixed it.
2) It was just a test; I was also using normal streaming for other things.
I was wondering how you could tell it was the sql-kafka connector version
from reading the logs.
Yeah, don't mix multiple versions of Kafka clients. That's not 100%
certain to be the cause of your problem, but it can't be helping.
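If you want to confirm which Kafka client actually got loaded, one quick
check (plain JDK reflection, nothing Spark-specific) is to ask the JVM
where the class came from; the printed JAR path normally embeds the
kafka-clients version:

    // Ask the classloader where KafkaConsumer was loaded from; the JAR
    // file name in the printed URL usually carries the version number.
    val location = classOf[org.apache.kafka.clients.consumer.KafkaConsumer]
      .getProtectionDomain.getCodeSource.getLocation
    println(location)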
As for your comments about async commits, read
https://issues.apache.org/jira/browse/SPARK-22486
and, if you think your use case is still relevant to others, comment on
that ticket.
Hadoop trunk (i.e. 3.1, when it comes out) has the code to do zero-rename
commits:
http://steveloughran.blogspot.co.uk/2017/11/subatomic.html
If you want to play with it today, you can build Hadoop trunk and Spark
master, plus a little glue JAR of mine, to get Parquet to play properly.
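For a sense of what switching it on might look like from the Spark side,
here's a rough sketch, assuming a Hadoop 3.1+ build with the S3A committers
on the classpath. The property names follow the Hadoop S3A committer docs;
verify them against whatever build you end up with:

    import org.apache.spark.SparkConf

    // Route S3A output through the "magic" committer instead of the
    // classic rename-based commit. Both are Hadoop-side settings, passed
    // through to Hadoop via Spark's spark.hadoop. prefix.
    val conf = new SparkConf()
      .set("spark.hadoop.fs.s3a.committer.name", "magic")
      .set("spark.hadoop.fs.s3a.committer.magic.enabled", "true")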
In your case, it looks like two versions of the Kafka client existed in the
same JVM at runtime, so there was a version conflict.
About "I don't find the Spark async commit useful for our needs": do you
mean code like the line below?
kafkaDStream.asInstanceOf[CanCommitOffsets].commitAsync(ranges)
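For context, that call is the offset-commit hook from the
spark-streaming-kafka-0-10 integration. The usual pattern, roughly as in
the Spark docs (the topic name, group id, and broker address below are
placeholders):

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010._
    import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
    import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

    val ssc = new StreamingContext(
      new SparkConf().setAppName("commit-demo"), Seconds(10))
    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "demo",
      // Commit manually via commitAsync, not the consumer's auto-commit.
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent,
      Subscribe[String, String](Seq("mytopic"), kafkaParams))

    stream.foreachRDD { rdd =>
      // Capture this batch's offsets before any shuffle or repartition.
      val ranges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
      // ... process rdd ...
      // Commit asynchronously once the batch's work has succeeded.
      stream.asInstanceOf[CanCommitOffsets].commitAsync(ranges)
    }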