Hi,
I can use Redis, but I’m still having a hard time figuring out how I can
eliminate duplicate data. Today, without broadcast state in 1.4, I’m using a
cache to lazily load the data. I thought broadcast state would be similar
to that of Kafka Streams, where I have read access to the state across the
pipeline…
Hi Oliwer,
From the description, it seems the state was not cleared. Maybe you could
check how many times {{windowState.clear()}} is triggered in
{{WindowOperator#processElement}}, and try to figure out why the state
was not cleared.
Best,
Congxian
Oliwer Kostera wrote on Fri, Sep 27, 2019 at 4:14 PM:
> Hi
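A minimal sketch of one way to count those cleanups, assuming a custom trigger can be swapped into the job (the class name and log text below are illustrative, not from the thread). Trigger#clear() is called when a window's state is cleaned up, so logging it gives a rough proxy for how often windowState.clear() runs:

import org.apache.flink.streaming.api.windowing.triggers.Trigger;
import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;

public class ClearLoggingTrigger extends Trigger<Object, TimeWindow> {

    private final Trigger<Object, TimeWindow> delegate;

    public ClearLoggingTrigger(Trigger<Object, TimeWindow> delegate) {
        this.delegate = delegate;
    }

    @Override
    public TriggerResult onElement(Object element, long timestamp,
                                   TimeWindow window, TriggerContext ctx) throws Exception {
        return delegate.onElement(element, timestamp, window, ctx);
    }

    @Override
    public TriggerResult onProcessingTime(long time, TimeWindow window,
                                          TriggerContext ctx) throws Exception {
        return delegate.onProcessingTime(time, window, ctx);
    }

    @Override
    public TriggerResult onEventTime(long time, TimeWindow window,
                                     TriggerContext ctx) throws Exception {
        return delegate.onEventTime(time, window, ctx);
    }

    @Override
    public void clear(TimeWindow window, TriggerContext ctx) throws Exception {
        // count or log these calls to see how often window state is cleaned up
        System.out.println("clear() called for window " + window);
        delegate.clear(window, ctx);
    }
}

It can be plugged in with something like .window(...).trigger(new ClearLoggingTrigger(EventTimeTrigger.create())).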
Hi,
Could you use a cache system such as HBase or Redis to store this
data, and query the cache when needed?
Best,
Congxian
Navneeth Krishnan wrote on Tue, Oct 1, 2019 at 10:15 AM:
> Thanks Oytun. The problem with doing that is the same data will have to
> be stored multiple times, wasting memory…
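For what it's worth, a minimal sketch of the lazy-lookup pattern described above, using Jedis (host, port, and the fallback value are placeholders):

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;
import redis.clients.jedis.Jedis;

public class EnrichFromRedis extends RichMapFunction<String, String> {

    private transient Jedis jedis;

    @Override
    public void open(Configuration parameters) {
        jedis = new Jedis("redis-host", 6379); // placeholder host/port
    }

    @Override
    public String map(String key) {
        String value = jedis.get(key); // lazy lookup against the shared cache
        return value != null ? value : key + ":unknown";
    }

    @Override
    public void close() {
        if (jedis != null) {
            jedis.close();
        }
    }
}

Each parallel subtask opens its own connection in open(), so the data itself lives once in Redis rather than once per operator.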
Hi Nishant,
On a brief look, I think the problem is with your 2nd query:
>
> *Table2*...
> Table latestBadIps = tableEnv.sqlQuery("SELECT MAX(b_proctime) AS
> mb_proctime, bad_ip FROM BadIP ***GROUP BY bad_ip*** HAVING
> MIN(b_proctime) > CURRENT_TIMESTAMP - INTERVAL '2' DAY ");
> tableEnv.regi…
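For comparison, the time-windowed join shape that Flink SQL expects looks roughly like this (a sketch only, reusing the tableEnv from the quoted code: BadIP, bad_ip and b_proctime come from the quoted query, while KafkaStream, ip and k_proctime are assumed names):

Table joined = tableEnv.sqlQuery(
    "SELECT k.ip, k.k_proctime, b.bad_ip " +
    "FROM KafkaStream k, BadIP b " +
    "WHERE k.ip = b.bad_ip " +
    "AND k.k_proctime BETWEEN b.b_proctime - INTERVAL '2' DAY AND b.b_proctime");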
Hi all,
We are running some of our Flink jobs with Job Manager High Availability.
Occasionally we get a cluster that comes up improperly and doesn’t respond.
Attempts to submit the job seem to hang, and when we hit the /overview REST
endpoint on the Job Manager we get a 500 error and a fencing token error…
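For context, this is the usual ZooKeeper-based HA setup in flink-conf.yaml (hosts and paths below are placeholders, not taken from the report):

high-availability: zookeeper
high-availability.zookeeper.quorum: zk1:2181,zk2:2181,zk3:2181
high-availability.storageDir: hdfs:///flink/ha/
high-availability.cluster-id: /my-cluster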
Hi everyone.
I wanted to ask Flink users: what are the most subtle Flink bugs that
people have witnessed? The cause of the bugs could be anything (e.g.
wrong assumptions about the data, parallelism of a non-parallel operator,
simple mistakes).
We are developing a testing framework for Flink, and it would…
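To make one of those bug classes concrete, here is a small self-contained example (my own illustration, not from the thread) of the "parallelism of a non-parallel operator" mistake: a plain SourceFunction must run with parallelism 1, so this fails at graph-construction time with an IllegalArgumentException:

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.SourceFunction;

public class NonParallelSourceMistake {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.addSource(new SourceFunction<Long>() {
            private volatile boolean running = true;

            @Override
            public void run(SourceContext<Long> ctx) throws Exception {
                long i = 0;
                while (running) {
                    ctx.collect(i++);
                }
            }

            @Override
            public void cancel() {
                running = false;
            }
        }).setParallelism(4); // throws: a non-parallel source must have parallelism 1
    }
}

A ParallelSourceFunction (or RichParallelSourceFunction) is needed for parallelism greater than 1.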
Thanks Oytun. The problem with doing that is the same data will have to
be stored multiple times, wasting memory. In my case there will be around a
million entries which need to be used by at least two operators for now.
Thanks
On Mon, Sep 30, 2019 at 5:42 PM Oytun Tez wrote:
> This is how we currently use broadcast state…
This is how we currently use broadcast state. Our states are re-usable
(code-wise); every operator that wants to consume the broadcast basically
keeps the same descriptor state locally by processBroadcastElement'ing into
a local state.
I am open to suggestions. I see this as a hard drawback of dataflow
programming…
You can re-use the broadcasted state (along with its descriptor) that comes
into your KeyedBroadcastProcessFunction in another operator downstream.
That's basically duplicating the broadcasted state in whichever operator you
want to use it, every time.
---
Oytun Tez
*M O T A W O R D*
The World's Fas…
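A minimal sketch of the pattern described above, assuming a simple string "rules" broadcast (all names and types here are illustrative):

import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.BroadcastStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.KeyedBroadcastProcessFunction;
import org.apache.flink.util.Collector;

public class BroadcastStateSketch {

    static final MapStateDescriptor<String, String> RULES_DESC =
        new MapStateDescriptor<>("rules", Types.STRING, Types.STRING);

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<String> events = env.fromElements("a", "b", "c");
        DataStream<String> rules = env.fromElements("rule-1");

        BroadcastStream<String> broadcast = rules.broadcast(RULES_DESC);

        events.keyBy(value -> value)
            .connect(broadcast)
            .process(new KeyedBroadcastProcessFunction<String, String, String, String>() {
                @Override
                public void processElement(String value, ReadOnlyContext ctx,
                                           Collector<String> out) throws Exception {
                    // read-only view of the broadcast state on the keyed side
                    out.collect(value + ":" + ctx.getBroadcastState(RULES_DESC).get("rule-1"));
                }

                @Override
                public void processBroadcastElement(String rule, Context ctx,
                                                    Collector<String> out) throws Exception {
                    // every parallel instance stores its own copy of the rule
                    ctx.getBroadcastState(RULES_DESC).put("rule-1", rule);
                }
            })
            .print();

        // a second operator cannot read that state directly; it has to
        // connect to the same BroadcastStream again, duplicating the state,
        // which is the drawback discussed above
        env.execute("broadcast state sketch");
    }
}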
Hi All,
Is it possible to access a broadcast state across the pipeline? For
example, say I have a KeyedBroadcastProcessFunction which adds the incoming
data to state, and I have a downstream operator where I need the same state
as well; would I be able to just read the broadcast state with a read-only…
Hi, so here is what I have done (a sketch of these steps follows below):
1- I load my CSV using a CSV table source.
2- I set up a Kafka stream to read my incoming events.
3- Map my events to a POJO.
4- Join the 2 tables.
5- Push the joined result to Elasticsearch.
This works absolutely fine. So what's the difference between this and the
prop…
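Here is the promised sketch of steps 1-5, assuming Flink 1.9-era APIs (the file path, topic, field names, and the print sink standing in for Elasticsearch are all placeholders):

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.java.StreamTableEnvironment;
import org.apache.flink.table.sources.CsvTableSource;
import org.apache.flink.types.Row;

import java.util.Properties;

public class CsvKafkaJoinSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);

        // 1- CSV table source (path and fields are hypothetical)
        CsvTableSource csv = CsvTableSource.builder()
            .path("/data/reference.csv")
            .field("id", Types.STRING)
            .field("label", Types.STRING)
            .build();
        tableEnv.registerTableSource("Reference", csv);

        // 2- Kafka stream of incoming events, 3- registered with a field name
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        DataStream<String> events =
            env.addSource(new FlinkKafkaConsumer<>("events", new SimpleStringSchema(), props));
        tableEnv.registerDataStream("Events", events, "id");

        // 4- join the two tables
        Table joined = tableEnv.sqlQuery(
            "SELECT e.id, r.label FROM Events e JOIN Reference r ON e.id = r.id");

        // 5- in the real job this would be an Elasticsearch sink
        tableEnv.toAppendStream(joined, Row.class).print();
        env.execute("csv-kafka join sketch");
    }
}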
Vijay,
That is my understanding as well: the HA solution only solves the problem
up to the point where all job managers fail/restart at the same time. That's
where my original concern was.
But to Aleksandar and Yun's point, running in HA with 2 or 3 Job Managers
per cluster--as long as they are all dep…
Ok, thanks. It's basically telephone area codes; they barely ever change.
On Mon, 30 Sep 2019 at 06:03, Gaël Renoux wrote:
> Hi John,
>
> I've had a similar requirement, and I've resorted to simply using a static
> cache (I'm coding in Scala, so that's a lazy value on a singleton object -
> in Java…
Thanks for the information, Dian. It is not an urgent issue though. Maybe
we can revisit it later :-)
Best,
tison.
Dian Fu wrote on Mon, Sep 30, 2019 at 7:34 PM:
> Hi tison,
>
> Actually there may be compatibility issues as the
> BatchTableEnvironment/StreamTableEnvironment under "api.java" are public
> interfaces.
>
Hi tison,
Actually there may be compatibility issues as the
BatchTableEnvironment/StreamTableEnvironment under "api.java" are public
interfaces.
Regards,
Dian
> On Sep 30, 2019, at 4:49 PM, Zili Chen wrote:
>
> Hi Dian,
>
> What about renaming api.java to japi if there is no unexpected compatibility
> issue?
Hi John,
I've had a similar requirement, and I've resorted to simply using a static
cache (I'm coding in Scala, so that's a lazy value on a singleton object -
in Java that would be a static value on some utility class, with a
synchronized lazy-loading getter). The value is reloaded after some
duration…
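A minimal Java sketch of what is described above: a static value on a utility class with a synchronized lazy-loading getter and a reload interval (the names, TTL, and loader are illustrative):

import java.util.HashMap;
import java.util.Map;

public final class AreaCodeCache {

    private static final long TTL_MS = 24L * 60 * 60 * 1000; // reload once a day
    private static Map<String, String> cache;
    private static long loadedAt;

    private AreaCodeCache() {}

    // synchronized lazy-loading getter, reloading after the TTL elapses
    public static synchronized Map<String, String> get() {
        long now = System.currentTimeMillis();
        if (cache == null || now - loadedAt > TTL_MS) {
            cache = loadFromSource();
            loadedAt = now;
        }
        return cache;
    }

    private static Map<String, String> loadFromSource() {
        // hypothetical loader: replace with a file read or service call
        return new HashMap<>();
    }
}

Since it is static, the cache is shared by all operator subtasks in the same task manager JVM, which suits rarely changing data like area codes.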
Hi Team,
I am trying to join the [kafka stream] and the [badip stream grouped by badip].
Can someone please help me verify what is wrong in the highlighted
query? Am I writing the time-window join query wrong for this use case,
or is it a bug that I should report?
What is the workaround, i…
Hi Dian,
What about renaming api.java to japi if there is no unexpected compatibility
issue? I think we can always avoid using `.java.` in package names.
Best,
tison.
Dian Fu wrote on Thu, Sep 26, 2019 at 10:54 PM:
> Hi Nick,
>
> There is a package named "org.apache.flink.table.api.java" in Flink and
> so the…