Thanks.
But there is a problem: the classes referenced in the code would need to be
modified, and I want to avoid changing the existing code.
Gabor Somogyi wrote on Fri, Dec 25, 2020 at 12:16 AM:
> One can wrap the JDBC driver, and that way everything can be sniffed.
>
> On Thu, 24 Dec 2020, 03:51
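For reference, a minimal sketch of the driver-wrapping idea mentioned above: a
delegating java.sql.Driver that logs the URL and connection properties before
handing off to the real driver. The class name and the MySQL driver shown in the
registration comment are illustrative, not from the original thread.

import java.sql.{Connection, Driver, DriverPropertyInfo}
import java.util.Properties
import java.util.logging.Logger

// Delegating driver: log the connection info, then forward to the real driver.
class SniffingDriver(delegate: Driver) extends Driver {
  override def connect(url: String, info: Properties): Connection = {
    // In real use, mask credentials before logging.
    println(s"JDBC connect -> url=$url, properties=$info")
    delegate.connect(url, info)
  }
  override def acceptsURL(url: String): Boolean = delegate.acceptsURL(url)
  override def getPropertyInfo(url: String, info: Properties): Array[DriverPropertyInfo] =
    delegate.getPropertyInfo(url, info)
  override def getMajorVersion: Int = delegate.getMajorVersion
  override def getMinorVersion: Int = delegate.getMinorVersion
  override def jdbcCompliant(): Boolean = delegate.jdbcCompliant()
  override def getParentLogger: Logger = delegate.getParentLogger
}

// Registration (the wrapped driver class is illustrative):
// java.sql.DriverManager.registerDriver(new SniffingDriver(new com.mysql.cj.jdbc.Driver()))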
Hi:
guys, I have some Spark programs that perform database connection
operations. I want to capture the connection information, such as the JDBC
connection properties, but without being too intrusive to the code.
Any good ideas? Can a Java agent do it?
Any more detail about it?
bannya wrote on Fri, Dec 18, 2020 at 11:25 AM:
> Hi,
>
> I have a spark structured streaming application that is reading data from a
> Kafka topic (16 partitions). I am using standalone mode. I have two worker
> nodes; one node is on the same machine as the master and the other one is
The logs printed in the map function exist on the worker node; you can
access them directly, or browse them through the web UI.
abby37 wrote on Wed, Dec 23, 2020 at 1:53 PM:
> I want to print some logs in the mapPartitions transformation to log the
> internal working of the function.
> I have used the following
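As a rough sketch of the point above, assuming a log4j logger (names and messages
are illustrative): anything logged inside mapPartitions runs on the executors, so
the lines show up in each executor's stderr, reachable from the web UI.

import org.apache.log4j.Logger
import org.apache.spark.sql.SparkSession

object MapPartitionsLogging {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("map-partitions-logging").getOrCreate()
    val rdd = spark.sparkContext.parallelize(1 to 100, numSlices = 4)

    val doubled = rdd.mapPartitions { iter =>
      // Obtain the logger inside the closure: this code runs on the executors,
      // so the messages land in the executor logs (web UI -> Executors -> stderr).
      val log = Logger.getLogger("partition-logger")
      log.info("start processing a partition")
      iter.map { x =>
        log.debug(s"processing record $x")
        x * 2
      }
    }

    println(doubled.count())
    spark.stop()
  }
}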
If you cannot assemble the JDBC driver jar into your application jar
package, you can put the JDBC driver jar on the Spark classpath, generally
$SPARK_HOME/jars or $SPARK_HOME/lib.
Artemis User wrote on Fri, Dec 11, 2020 at 5:21 AM:
> What happened was that you made the mysql jar file only available to the
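As a follow-up sketch, assuming the MySQL connector jar has already been placed
under $SPARK_HOME/jars as described above; the URL, table, and credentials here
are illustrative.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("jdbc-read").getOrCreate()

// The driver class is resolved from the jar on the Spark classpath.
val df = spark.read
  .format("jdbc")
  .option("url", "jdbc:mysql://dbhost:3306/testdb")
  .option("driver", "com.mysql.cj.jdbc.Driver")
  .option("dbtable", "some_table")
  .option("user", "test_user")
  .option("password", "test_password")
  .load()

df.show()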
You can use the *foreach* sink to achieve the logic you want.
act_coder wrote on Wed, Nov 4, 2020 at 9:56 PM:
> I have a scenario where I would like to save the same streaming dataframe
> to
> two different streaming sinks.
>
> I have created a streaming dataframe which I need to send to both a Kafka
> topic and
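A hedged sketch of one way to do this, using foreachBatch (a close relative of the
foreach sink suggested above) so the same micro-batch is written to two sinks; the
source, broker, topic, paths, and checkpoint location are all illustrative.

import org.apache.spark.sql.{DataFrame, SparkSession}

val spark = SparkSession.builder().appName("two-sinks").getOrCreate()

val streamingDf = spark.readStream
  .format("rate").load()                          // stand-in for the real streaming source
  .selectExpr("CAST(value AS STRING) AS value")

// Write each micro-batch to two different sinks.
val writeTwoSinks: (DataFrame, Long) => Unit = (batchDf, _) => {
  batchDf.persist()
  // Sink 1: a Kafka topic.
  batchDf.write
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("topic", "output_topic")
    .save()
  // Sink 2: e.g. a Parquet directory.
  batchDf.write.mode("append").parquet("/tmp/output_parquet")
  batchDf.unpersist()
}

val query = streamingDf.writeStream
  .foreachBatch(writeTwoSinks)
  .option("checkpointLocation", "/tmp/two-sinks-checkpoint")
  .start()

query.awaitTermination()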
Is the connection pool configured for MongoDB full?
Daniel Stojanov wrote on Mon, Oct 26, 2020 at 10:28 AM:
> Hi,
>
>
> I receive an error message from the MongoDB server if there are too many
> Spark applications trying to access the database at the same time (about
> 3 or 4), "Cannot open a new cursor since
transaction batch size.
KhajaAsmath Mohammed wrote on Wed, Oct 21, 2020 at 12:19 PM:
> Yes. Changing back to latest worked but I still see the slowness compared
> to flume.
>
> Sent from my iPhone
>
> On Oct 20, 2020, at 10:21 PM, lec ssmi wrote:
>
>
> Do you start your application wi
Do you start your application by catching up on the early Kafka data?
Lalwani, Jayesh wrote on Wed, Oct 21, 2020 at 2:19 AM:
> Are you getting any output? Streaming jobs typically run forever, and keep
> processing data as it comes in the input. If a streaming job is working
> well, it will typically generate
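For context, a small sketch of what "chasing the early Kafka data" refers to: the
startingOffsets option of the Kafka source controls whether a brand-new query reads
the whole backlog or only new data. The broker address and topic are illustrative.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("kafka-starting-offsets").getOrCreate()

// "earliest" makes a fresh query read the entire backlog first (it can look slow
// while catching up); "latest" starts from the end of the topic. Once a checkpoint
// exists, Spark resumes from the checkpointed offsets and this option is ignored.
val df = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("subscribe", "events")
  .option("startingOffsets", "latest")
  .load()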
Sorry, the mail title is a little problematic. It should be "How to disable or
replace .."
lec ssmi wrote on Wed, Oct 14, 2020 at 9:27 AM:
> I have written a demo using spark3.0.0, and the location where the
> checkpoint file is saved has been explicitly specified like
>>
>> strea
I have written a demo using Spark 3.0.0, and the location where the
checkpoint file is saved has been explicitly specified, like
>
> stream.option("checkpointLocation", "file:///C:\\Users\\Administrator\\Desktop\\test")
But the app still throws an exception about the HDFS file system.
Is
For example, put the generated queries into a list, start every one, and then
call awaitTermination() on the last one.
Abhisheks wrote on Fri, May 1, 2020 at 10:32 AM:
> I hope you are using the Query object that is returned by Structured
> Streaming, right?
> The returned object contains a lot of
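A rough sketch of that pattern (the sinks, paths, and the rate test source are
illustrative); spark.streams.awaitAnyTermination() is an alternative to blocking on
the last query in the list.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.StreamingQuery

val spark = SparkSession.builder().appName("multiple-queries").getOrCreate()

val source = spark.readStream
  .format("rate")   // built-in test source, one row per second
  .load()

// Start every query and keep the returned StreamingQuery handles in a list.
val queries: Seq[StreamingQuery] = Seq(
  source.writeStream.format("console")
    .option("checkpointLocation", "/tmp/chk-console")
    .start(),
  source.writeStream.format("json")
    .option("path", "/tmp/rate-json")
    .option("checkpointLocation", "/tmp/chk-json")
    .start()
)

// Block on the last one, as suggested above...
queries.last.awaitTermination()
// ...or block until any of them terminates:
// spark.streams.awaitAnyTermination()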
for this situation?
Best
Lec Ssmi
pr 28, 2020 at 9:25, Jungtaek Lim
> wrote:
> The root cause of the exception occurred on the executor side ("Lost task 10.3
> in stage 1.0 (TID 81, spark6, executor 1)"), so you may need to check there.
>
> On Tue, Apr 28, 2020 at 2:52 PM lec ssmi wrote:
>
> Hi:
> On
on$streaming$StreamExecution$$runStream(StreamExecution.scala:279)
> ... 1 more
According to the exception stack, it seems to have nothing to do with the
logic of my code. Is this a Spark bug or something? The Spark version is
2.3.1.
Best
Lec Ssmi
as been expired. What is the solution for this?
>
> On Thu, Mar 19, 2020 at 10:53 AM lec ssmi wrote:
>
>> 1.Maybe we can't use customized group id in structured streaming.
>> 2.When restarting from failure or killing , the group id changes, but the
>> starting off
1. Maybe we can't use a customized group id in Structured Streaming.
2. When restarting after a failure or a kill, the group id changes, but the
starting offset will be the last one you consumed last time.
Srinivas V wrote on Thu, Mar 19, 2020 at 12:36 PM:
> Hello,
> 1. My Kafka consumer name is randomly being
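A small sketch of point 2 above: the consumed offsets live in the checkpoint, so a
restarted query resumes from the last position even though the auto-generated Kafka
group id changes. The broker, topic, and paths are illustrative.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("kafka-resume").getOrCreate()

val stream = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("subscribe", "events")
  // Only honored on the very first start; afterwards offsets come from the checkpoint.
  .option("startingOffsets", "earliest")
  .load()

stream.selectExpr("CAST(value AS STRING) AS value")
  .writeStream
  .format("console")
  // The checkpoint is what lets a restart resume from the last consumed offsets.
  .option("checkpointLocation", "/tmp/kafka-resume-checkpoint")
  .start()
  .awaitTermination()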
Maybe you can combine the fields you want to use into one field.
Something Something wrote on Tue, Mar 3, 2020 at 6:37 AM:
> I am writing a Stateful Streaming application in which I am using
> mapGroupsWithState to create aggregates for Groups but I need to create
> *Groups
> based on more than one column in
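A minimal sketch of the combine-the-fields idea above, grouping on a tuple key
before mapGroupsWithState; the case class, columns, and the running-total state
logic are illustrative.

import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.streaming.GroupState

case class Event(userId: String, region: String, amount: Double)

val spark = SparkSession.builder().appName("composite-key-state").getOrCreate()
import spark.implicits._

// A streaming Dataset[Event]; the rate source is just a stand-in for the real input.
val events: Dataset[Event] = spark.readStream
  .format("rate").load()
  .selectExpr("CAST(value AS STRING) AS userId", "'emea' AS region",
              "CAST(value AS DOUBLE) AS amount")
  .as[Event]

// Combine the columns you want to group on into a single key (a tuple here);
// the ordinary single-key mapGroupsWithState API then applies.
val totals = events
  .groupByKey(e => (e.userId, e.region))
  .mapGroupsWithState { (key: (String, String), rows: Iterator[Event], state: GroupState[Double]) =>
    val updated = state.getOption.getOrElse(0.0) + rows.map(_.amount).sum
    state.update(updated)
    (key, updated)
  }

totals.writeStream
  .outputMode("update")   // mapGroupsWithState requires update output mode
  .format("console")
  .option("checkpointLocation", "/tmp/composite-key-checkpoint")
  .start()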
Such as:
df.withWatermark("time", "window size")
  .dropDuplicates("id")
  .withWatermark("time", "real watermark")
  .groupBy(window("time", "window size", "window size"))
  .agg(count("id"))
ake count(distinct count) success?
Tathagata Das wrote on Fri, Feb 28, 2020 at 10:25 AM:
> 1. Yes. All times are in event time, not processing time. So you may get 10AM
> event time data at 11AM processing time, but it will still be compared
> against all data within 9-10AM event times.
>
> 2. Show
Hi:
I'm new to Structured Streaming. Because the built-in API cannot
perform a count-distinct within a window, I want to use
dropDuplicates first, and then perform the windowed count.
But in the process of using it, there are two problems:
1. Because it is streaming computing,
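For reference, a runnable rendering of the dropDuplicates-then-window-count pipeline
sketched earlier in this thread. The source, column names, and watermark durations
are illustrative, and whether this really yields an exact distinct count is exactly
what the thread is discussing, so treat it as a sketch rather than a recipe.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, count, window}

val spark = SparkSession.builder().appName("dedup-then-window-count").getOrCreate()

// A streaming frame with an event-time column "time" and an "id" column
// (the rate source is just a stand-in for the real input).
val events = spark.readStream
  .format("rate").load()
  .selectExpr("timestamp AS time", "CAST(value % 100 AS STRING) AS id")

val windowedDistinct = events
  .withWatermark("time", "10 minutes")   // watermark sized like the window, for the dedup state
  .dropDuplicates("id")                  // keep the first occurrence of each id
  .withWatermark("time", "2 minutes")    // the "real" watermark for the aggregation
  .groupBy(window(col("time"), "10 minutes"))
  .agg(count("id").as("distinct_ids"))

windowedDistinct.writeStream
  .outputMode("append")
  .format("console")
  .option("checkpointLocation", "/tmp/dedup-window-checkpoint")
  .start()
  .awaitTermination()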