Hi Timo,
Thanks for the link.
Not sure I understand your suggestion though. Is the goal here reducing the
number of parameters passed to the UDF? If that's the case I can maybe have
the tag names there, but I still need the expressions to get evaluated before
entering the eval. Do you see this in a d
I'm running SQL along with a custom trigger, so think about this scenario:
SELECT id, COUNT(*) AS count_events, atLeastOneIsTrue(booleanField) AS shouldTrigger
FROM my_table
GROUP BY id
Transforming it to a retract stream and then filtering by the shouldTrigger
field works as expected.
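For readers following along, the filtering step can be sketched as a plain-Java simulation of what a retract-stream conversion produces: pairs of (isAdd, row), of which only add messages with the shouldTrigger flag set are kept. The RetractMessage type and its field layout here are simplified stand-ins, not the actual Flink types.

```java
import java.util.ArrayList;
import java.util.List;

public class RetractFilterSketch {
    // Simplified stand-in for Flink's Tuple2<Boolean, Row>: the boolean
    // marks an "add" (true) vs. a "retract" (false) message.
    static class RetractMessage {
        final boolean isAdd;
        final long id;
        final long countEvents;
        final boolean shouldTrigger;

        RetractMessage(boolean isAdd, long id, long countEvents, boolean shouldTrigger) {
            this.isAdd = isAdd;
            this.id = id;
            this.countEvents = countEvents;
            this.shouldTrigger = shouldTrigger;
        }
    }

    // Keep only add messages whose shouldTrigger flag is set, mirroring a
    // filter applied after the retract-stream conversion.
    static List<RetractMessage> keepTriggering(List<RetractMessage> in) {
        List<RetractMessage> out = new ArrayList<>();
        for (RetractMessage m : in) {
            if (m.isAdd && m.shouldTrigger) {
                out.add(m);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<RetractMessage> stream = new ArrayList<>();
        stream.add(new RetractMessage(true, 1, 1, false));  // add, no trigger
        stream.add(new RetractMessage(false, 1, 1, false)); // retraction
        stream.add(new RetractMessage(true, 1, 2, true));   // add, triggers
        System.out.println(keepTriggering(stream).size()); // prints 1
    }
}
```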
BTW, looking at past posts on this issue [1], it should have been fixed? I'm
using version 1.7.2.
Also, the recommendation was to use a custom function, though that's exactly
what I'm doing with the conditionalArray function [2].
Thanks!
[1]
http://apache-flink-user-mailing-list-archive.2336050.n4.nabbl
In a subsequent run I get:
Caused by: org.codehaus.janino.JaninoRuntimeException: Code of method
"split$3681$(LDataStreamCalcRule$3682;)V" of class "DataStreamCalcRule$3682"
grows beyond 64 KB
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Hey,
While running a SQL query I get an OutOfMemoryError exception and "Table
program cannot be compiled" [2].
In my scenario I'm trying to enrich an event using an array of tags. Each
tag has a boolean classification (like a WHERE clause), and with a custom
function I'm filtering the array to keep
Just to be clearer about my goal:
I'm trying to enrich the incoming stream with some meaningful tags based on
conditions from the event itself.
So the input stream could carry an event that looks like:
class Car {
    int year;
    String modelName;
}
I will have a config that defines tags such as:
"NiceCar"
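The enrichment described above can be sketched in plain Java: each configured tag is a name plus a boolean condition over the event, and the enrichment step keeps the names of the tags whose condition holds. The "year >= 2018" condition for "NiceCar" is a hypothetical example; the real conditions would come from the job's config.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class TagEnrichmentSketch {
    // Event type from the example above.
    static class Car {
        final int year;
        final String modelName;

        Car(int year, String modelName) {
            this.year = year;
            this.modelName = modelName;
        }
    }

    // A configured tag: a name plus a boolean classification over the event.
    static class Tag {
        final String name;
        final Predicate<Car> condition;

        Tag(String name, Predicate<Car> condition) {
            this.name = name;
            this.condition = condition;
        }
    }

    // Evaluate all configured tags against one event, keeping matches only.
    static List<String> enrich(Car car, List<Tag> config) {
        List<String> tags = new ArrayList<>();
        for (Tag tag : config) {
            if (tag.condition.test(car)) {
                tags.add(tag.name);
            }
        }
        return tags;
    }

    public static void main(String[] args) {
        List<Tag> config = new ArrayList<>();
        // Hypothetical condition; the real one would be parsed from config.
        config.add(new Tag("NiceCar", c -> c.year >= 2018));
        System.out.println(enrich(new Car(2019, "ModelX"), config)); // [NiceCar]
    }
}
```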
Hey,
I'm trying to create a SQL query which, given input from a stream with a
generic class type T, will create a new stream with the structure:
{
origin : T
resultOfSomeSQLCalc : Array[String]
}
It seems that just by doing "SELECT *" I can convert the resulting table
back to a st
Apparently the solution is to force the map creation using a UDF and to have
the UDF return Types.GENERIC(Map.class).
That makes them compatible, with both treated as GenericType.
Thanks!
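The fix above can be sketched as a scalar UDF that builds the map itself and declares a generic result type. This is a sketch against the Flink 1.7-era Table API (it needs flink-table on the classpath and is not self-contained here); the function name and key/value signature are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.table.functions.ScalarFunction;

// Builds the map inside the UDF so both the SQL side and the POJO side
// see the same GenericType<java.util.Map>.
public class BuildMap extends ScalarFunction {

    public Map<String, Long> eval(String k1, Long v1, String k2, Long v2) {
        Map<String, Long> result = new HashMap<>();
        result.put(k1, v1);
        result.put(k2, v2);
        return result;
    }

    @Override
    public TypeInformation<?> getResultType(Class<?>[] signature) {
        // Force a generic type instead of MapTypeInfo so the retract-stream
        // conversion treats both sides as GenericType.
        return Types.GENERIC(Map.class);
    }
}
```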
I'm trying to convert a SQL query that has a select map[..] into a POJO with
a Map (using tableEnv.toRetractStream).
It seems to fail when the field requestedTypeInfo is GenericTypeInfo with
GenericType while the field type itself is MapTypeInfo with
Map
Exception in thread "main" org.apache.flin
Sorry to flood this thread, but to keep documenting my experiments:
So far I've been using retract to a Row and then mapping to a dynamic POJO
that is created (using ByteBuddy) according to the select fields in the SQL.
Considering the error, I'm now trying to remove the usage of Row and use the
dynamic type d
Debugging locally, it seems like the state descriptor of "GroupAggregateState"
is creating an additional field (TupleSerializer of SumAccumulator)
serializer within the RowSerializer. I'm guessing this is what's causing the
incompatibility? Is there any workaround I can do?
Hi Fabian,
It seems like it didn't work.
Let me specify what I have done: I have a SQL query that looks something like:
SELECT a, SUM(b), map['sum_c', SUM(c), 'sum_d', SUM(d)] AS my_map FROM ...
GROUP BY a
As you said, I'm preventing keys from living in the state forever by setting
an idle state retention time (+ I'm tr
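For context, the idle state retention setup mentioned here looks roughly like this in the Flink 1.7-era API. This is a non-self-contained fragment (it assumes an existing StreamTableEnvironment `tableEnv` and a Table `result`), and the retention times are placeholder values, not a recommendation.

```java
import org.apache.flink.api.common.time.Time;
import org.apache.flink.table.api.StreamQueryConfig;

StreamQueryConfig qConfig = tableEnv.queryConfig();
// Expire per-key aggregation state that has been idle between the min and
// max retention times; 12h/24h are placeholders.
qConfig.withIdleStateRetentionTime(Time.hours(12), Time.hours(24));

// Pass the config when converting, so the GROUP BY state can expire.
tableEnv.toRetractStream(result, Row.class, qConfig);
```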
Looking further into the RowType, it seems this field is translated as a
CURSOR rather than a map... not sure why.
Hey,
As I'm building a SQL query, I'm trying to conditionally build a map such that
there won't be any keys with null values in it. AFAIK, in Calcite there's
no native way to do it (other than using CASE to build the map in different
ways, but then I have a lot of key/value pairs so that's not reaso
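The null-free map construction asked about here can be sketched as a plain-Java helper of the kind one might wrap in a UDF: it takes alternating key/value arguments and skips any pair whose value is null. The helper name and signature are illustrative, not an existing API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ConditionalMapSketch {
    // Build a map from alternating key/value arguments, skipping any pair
    // whose value is null, so the result never contains null values.
    static Map<String, Object> conditionalMap(Object... kv) {
        if (kv.length % 2 != 0) {
            throw new IllegalArgumentException("expected key/value pairs");
        }
        Map<String, Object> result = new LinkedHashMap<>();
        for (int i = 0; i < kv.length; i += 2) {
            if (kv[i + 1] != null) {
                result.put((String) kv[i], kv[i + 1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The null value for 'sum_d' is dropped entirely.
        Map<String, Object> m = conditionalMap("sum_c", 10L, "sum_d", null);
        System.out.println(m); // {sum_c=10}
    }
}
```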
Thanks Rong,
I made a quick test changing the SQL select (adding a select field in the
middle), reran the job from a savepoint, and it worked without any errors. I
want to make sure I understand at what point the state is stored and how it
works.
Let's simplify the scenario and
Hey,
My job is built on SQL that is injected as an input to the job, so let's take
an example:
SELECT a, MAX(b) AS MaxB, MAX(c) AS MaxC FROM registered_table GROUP BY a
(side note: in order for the state not to grow indefinitely, I'm transforming
to a retract stream and filtering based on a cus
Hey All,
Just read the excellent monitoring blog post
https://flink.apache.org/news/2019/02/25/monitoring-best-practices.html
I'm looking at reducing the number of unique metrics, especially items I
can compromise on consolidating, such as using indices instead of IDs.
Specifically looking at t
Hey Jayant, I'm getting the same using Gradle. My metrics reporter and my
application both use the flink-metrics-dropwizard dependency for reporting
Meters. How should I solve it?
OK, thanks for the help
It seems like the operator name for a SQL GROUP BY is the query string
itself. I get:
"The operator name groupBy: (myGroupField), select: (myGroupField,
someOther... )... exceeded the 80 characters length limit and was truncated"
Is there a way to name the SQL query operator?
Actually, I was looking at the task manager level. I did have more slots than
shards, so it makes sense that I had an idle task manager while other task
managers split the shards between their slots.
Thanks!
Looking at the doc
https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kinesis.html
"When the number of shards is larger than the parallelism of the consumer,
then each consumer subtask can subscribe to multiple shards; otherwise if
the number of shards is smaller than the paralle
If it's running in parallel, aren't you just adding readers, which maxes out
your provisioned throughput? This probably doesn't belong here and is rather
a Kinesis thing, but I suggest increasing your number of shards.
Thanks Fabian!
I have a scenario in which I do a non-windowed GROUP BY using SQL, something
like:
"SELECT COUNT(*) AS events, shouldTrigger(..) AS shouldTrigger FROM source
GROUP BY sessionId"
I'm then converting to a retract stream, filtering by "add" messages, then
further filtering by the "shouldTrigger" field a
Following up on the actual question: is there a way to register a
KeyedStream as table(s) and have a trigger per key?
OK, I think I figured it out, though I'm not sure of the exact reason:
It seems that I need to have a stream type that is the GenericType of the
super class, rather than a POJO of the concrete generated class. It seems
like the operator definition otherwise cannot load the POJO class on task
creation.
So
Update on this: if I just do an empty mapping and drop the SQL part, it works
just fine. I wonder if there's any class loading that needs to be done when
using SQL; not sure how I do that.
After removing some operators (which I still need, but I wanted to understand
where my issues are), I get a slightly different stacktrace (though still the
same issue).
My current operators are:
1. a SQL select with GROUP BY (returns a retract stream)
2. filter (take only non-retracted)
3. map (tuple t
Hey,
I'm trying to run a job which uses a dynamically generated class (through
Byte Buddy).
Think of me having a complex schema as YAML text and generating a class from
it. Throughout the job I am using an artificial super class (MySuperClass)
of the generated class (as, for example, I need to speci
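For readers unfamiliar with the setup, generating such a subclass with Byte Buddy looks roughly like this. It is a sketch, not the poster's actual code: it needs the byte-buddy dependency, and the class name and hard-coded fields stand in for the schema-driven version.

```java
import net.bytebuddy.ByteBuddy;
import net.bytebuddy.description.modifier.Visibility;

public class DynamicClassSketch {
    // Artificial super class referenced throughout the job.
    public static class MySuperClass { }

    public static Class<? extends MySuperClass> generate() {
        // Fields would normally be derived from the YAML schema;
        // they are hard-coded here for illustration.
        return new ByteBuddy()
            .subclass(MySuperClass.class)
            .name("com.example.GeneratedEvent")
            .defineField("year", int.class, Visibility.PUBLIC)
            .defineField("modelName", String.class, Visibility.PUBLIC)
            .make()
            .load(MySuperClass.class.getClassLoader())
            .getLoaded();
    }
}
```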
Hey
Say I'm aggregating an event stream by sessionId in SQL and emitting the
results once the session is "over". I guess I should be using Fire and Purge,
as I don't expect to need the session data once it's over. How should I treat
the idle state retention time? Is it needed at all if I'm using purge? w
Hey!
I have a use case in which I'm grouping a stream by session id; so far
pretty standard, though note that I need to do it through SQL and not the
Table API.
In my use case I have 2 trigger conditions: while one is time
(session inactivity), the other is based on a specific event marked as a