PySpark app which writes in parallel.
It would help if you could reproduce this and share a code snippet here.
All the best,
Farhan.
On Fri, May 12, 2023 at 10:17 AM Karthick Nk wrote:
> Hi Farhan,
> Thank you for your response. I am using Databricks with 11.3.x-scala2.12.
>
this is the case.
Thanks,
Farhan.
On Thu, May 11, 2023 at 2:54 PM Jacek Laskowski wrote:
> Hi Karthick,
>
> Sorry to say it but there's not enough "data" to help you. There should be
> something more above or below this exception snippet you posted that could
say about this.
Thanks for looking into it :)
Regards,
Farhan.
On Fri, Apr 30, 2021 at 7:01 PM Mich Talebzadeh
wrote:
> Hi Farhan,
>
> I have used it successfully and it works. The only thing that potentially
> can cause this issue is the jdbc driver itself. Have you tried another
Hi Mich,
I have tried this already; I am using the same methods as you in my Java
code. I see the same error: 'dbtable' or 'query' gets added as a connection
property in the JDBC connection string for the source database, which in my
case is AAS.
Thanks,
Farhan.
Hi Anbutech,
If I am not mistaken, I believe you are trying to read multiple
dataframes from around 150 different paths (in your case the Kafka
topics) to count their records. You have all these paths stored in a
CSV with columns year, month, day and hour.
Here is what I came up with; I have been
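My reply was cut off above, but one way to approach what is described — building the ~150 input paths from the CSV rows and then counting each — can be sketched roughly as follows. This is a minimal sketch: the base directory, the <base>/<year>/<month>/<day>/<hour> layout, and the file format are my assumptions for illustration, not details from the original thread; only the path construction is shown with the standard library, with the Spark read left as a comment.

```python
import csv
import io

def build_paths(csv_text, base="/mnt/kafka-input"):
    """Build one input path per CSV row from year/month/day/hour columns.

    The base directory and path layout are assumptions for illustration;
    adapt them to the real topic paths.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        f"{base}/{row['year']}/{row['month']}/{row['day']}/{row['hour']}"
        for row in reader
    ]

# Example CSV content (two of the ~150 rows).
sample = "year,month,day,hour\n2020,01,15,00\n2020,01,15,01\n"
paths = build_paths(sample)
# In the actual Spark job, each path would then be read and counted, e.g.:
#   counts = {p: spark.read.parquet(p).count() for p in paths}
```

Counting the paths one by one launches ~150 sequential jobs; passing the whole list to a single spark.read call and grouping by input_file_name() is usually faster.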
I tried running MLlib k-means with the 20newsgroups data set from sklearn.
On a 5000-document data set I get one cluster with most of the documents,
while the other clusters have only a handful of documents.
#code
from sklearn.datasets import fetch_20newsgroups

newsgroups_train = fetch_20newsgroups(subset='train', random_state=1,
                                      remove=('headers', 'footers'))
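As a side note on diagnosing the imbalance itself: once you have the predicted cluster assignments, the skew can be checked directly from the label counts. A minimal sketch (the labels below are made-up stand-ins for the model's predictions, not real output from this data set):

```python
from collections import Counter

# Hypothetical cluster assignments for 10 documents; in a real run these
# would come from the fitted k-means model's predictions.
labels = [0, 0, 0, 0, 0, 0, 0, 1, 2, 0]

# Count how many documents land in each cluster.
sizes = Counter(labels)
print(sizes.most_common())  # one dominant cluster, near-empty others
```

With raw term counts this kind of degenerate clustering is common; TF-IDF weighting and L2-normalising the document vectors before running k-means often spreads the clusters out.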