[jira] [Created] (FLINK-32710) The LeaderElection component IDs for running jobs are only the JobID, which might be confusing in the log output

2023-07-28 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-32710:
-

 Summary: The LeaderElection component IDs for running jobs are only the 
JobID, which might be confusing in the log output
 Key: FLINK-32710
 URL: https://issues.apache.org/jira/browse/FLINK-32710
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.18.0
Reporter: Matthias Pohl


I noticed that the leader log messages for the jobs use the plain job ID as the 
component ID. That might be confusing when reading the logs since it's a UUID 
with no additional context.

We might want to add a prefix (e.g. {{job-}}) to these component IDs.
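
For illustration, a minimal sketch of the suggested prefixing (the class and helper below are hypothetical; the actual change would live where the leader election component IDs are constructed):

{code:java}
import org.apache.flink.api.common.JobID;

public class ComponentIdSketch {
    // Hypothetical helper: derive a self-describing component ID from a JobID.
    static String jobComponentId(JobID jobId) {
        return "job-" + jobId; // the "job-" prefix gives the bare UUID context
    }

    public static void main(String[] args) {
        // Prints e.g. "job-<32 hex chars>" instead of the bare UUID
        System.out.println(jobComponentId(new JobID()));
    }
}
{code}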





Re: [VOTE] FLIP-322 Cooldown period for adaptive scheduler. Second vote.

2023-07-28 Thread Martijn Visser
+1 (binding)

On Fri, Jul 14, 2023 at 11:59 AM Prabhu Joseph wrote:

> *+1 (non-binding)*
>
> Thanks for working on this. We have seen a good improvement from the
> cooldown period this feature adds.
> Below are details on the test results from one of our clusters:
>
> On a scale-out operation, 8 new nodes were added one by one with a gap of
> ~30 seconds. There were 8 restarts within 4 minutes with the default
> behaviour, whereas there was only one with this feature (cooldown period
> of 4 minutes).
>
> The number of records processed by the job with this feature during the
> restart window is higher (2909764), whereas it is only 1323960 with the
> default behaviour: due to the multiple restarts, the job spends most of
> the time recovering, and any work the tasks progressed after the last
> successfully completed checkpoint is lost.
>
> Metric              | Default Adaptive Scheduler                          | Adaptive Scheduler With Cooldown Period
> NumRecordsProcessed | 1323960                                             | 2909764
> Job Parallelism     | 13 -> 20 -> 27 -> 34 -> 41 -> 48 -> 55 -> 62 -> 69  | 13 -> 69
> NumRestarts         | 8                                                   | 1
>
> Remarks:
> 1. The NumRecordsProcessed metric shows the difference the cooldown period
> makes: when the job restarts repeatedly, the tasks spend most of the time
> recovering, and the progress they made since the last successful checkpoint
> is lost on each restart.
> 2. There was only one restart with the cooldown period, which happened when
> the 8th node was added back.
>
> On Wed, Jul 12, 2023 at 8:03 PM Etienne Chauchot wrote:
>
> > Hi all,
> >
> > I'm going on vacation tonight for 3 weeks.
> >
> > Even though the vote is not finished, since the implementation is rather
> > quick and the design discussion had settled, I went ahead and implemented
> > FLIP-322 [1] to allow people to take a look while I'm off.
> >
> > [1] https://github.com/apache/flink/pull/22985
> >
> > Best
> >
> > Etienne
> >
> > On 12/07/2023 at 09:56, Etienne Chauchot wrote:
> > >
> > > Hi all,
> > >
> > > Would you mind casting your vote in this second vote thread (opened
> > > after new discussions) so that the subject can move forward?
> > >
> > > @David, @Chesnay, @Robert: you took part in the discussions, could you
> > > please send your vote?
> > >
> > > Thank you very much
> > >
> > > Best
> > >
> > > Etienne
> > >
> > > On 06/07/2023 at 13:02, Etienne Chauchot wrote:
> > >>
> > >> Hi all,
> > >>
> > >> Thanks for your feedback about the FLIP-322: Cooldown period for
> > >> adaptive scheduler [1].
> > >>
> > >> This FLIP was discussed in [2].
> > >>
> > >> I'd like to start a vote for it. The vote will be open for at least 72
> > >> hours (until July 9th 15:00 GMT) unless there is an objection or
> > >> insufficient votes.
> > >>
> > >> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-322+Cooldown+period+for+adaptive+scheduler
> > >> [2] https://lists.apache.org/thread/qvgxzhbp9rhlsqrybxdy51h05zwxfns6
> > >>
> > >> Best,
> > >>
> > >> Etienne
>
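
For reference, a minimal sketch of how the cooldown window described above might be enabled (the option key follows the FLIP-322 proposal and should be treated as an assumption until the linked PR is merged):

{code:java}
import org.apache.flink.configuration.Configuration;

public class CooldownConfigSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Proposed FLIP-322 option (assumed name; check the merged PR):
        // minimum interval between two scaling operations of the adaptive
        // scheduler.
        conf.setString("jobmanager.adaptive-scheduler.scaling-interval.min", "4 min");
        // With a 4 min cooldown, the eight node additions from the test above
        // collapse into a single rescale instead of eight restarts.
        System.out.println(conf);
    }
}
{code}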


[jira] [Created] (FLINK-32711) Type mismatch when proctime function used as parameter

2023-07-28 Thread Aitozi (Jira)
Aitozi created FLINK-32711:
--

 Summary: Type mismatch when proctime function used as parameter
 Key: FLINK-32711
 URL: https://issues.apache.org/jira/browse/FLINK-32711
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.18.0
Reporter: Aitozi


Reproduction case:

{code:sql}
SELECT TYPEOF(PROCTIME())
{code}

This query fails with:

org.apache.flink.table.planner.codegen.CodeGenException: Mismatch of function's 
argument data type 'TIMESTAMP_LTZ(3) NOT NULL' and actual argument type 
'TIMESTAMP_LTZ(3)'.
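
A minimal standalone reproducer (a sketch; it assumes flink-table-api-java plus a planner on the classpath):

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class ProctimeTypeofRepro {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());
        // Throws the CodeGenException above on the affected version:
        // PROCTIME() is declared TIMESTAMP_LTZ(3) NOT NULL, but the argument
        // reaching TYPEOF is typed as nullable TIMESTAMP_LTZ(3).
        tEnv.executeSql("SELECT TYPEOF(PROCTIME())").print();
    }
}
{code}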







Re: [DISCUSS] Flink and Externalized connectors leads to block and circular dependency problems

2023-07-28 Thread Jing Ge
Hi Ran Tao,

What is the current status? @Dian, there were many options discussed. Which
one do you prefer as the most feasible?

Best regards,
Jing

On Fri, Jul 7, 2023 at 2:37 PM Mason Chen wrote:

> Hi all,
>
> I also agree with what's been said above.
>
> +1, I think the Table API delegation is a good suggestion--it essentially
> allows a connector to get Python support for free. We've seen that
> Table/SQL and Python APIs complement each other well and are ideal for data
> scientists. With respect to unaligned functionalities, I think that also
> holds true for other APIs, e.g. DataStream and Table/SQL, since there is
> functionality that is not natural to represent as configuration/SQL.
>
> Best,
> Mason
>
> > On Wed, Jul 5, 2023 at 10:14 PM Dian Fu wrote:
>
> > Hi Chesnay,
> >
> > >> The wrapping of connectors is a bit of a maintenance nightmare and
> > >> doesn't really work with external/custom connectors.
> >
> > Cannot agree with you more.
> >
> > >> Have there ever been thoughts about changing flink-python's connector
> > >> setup to use the Table API connectors underneath?
> >
> > I'm still not sure whether this is feasible for all connectors; however,
> > it may be a good idea. The concern is that the DataStream API connector
> > functionalities may be unaligned between the Java and Python connectors.
> > Besides, there are still a few connectors which only have DataStream API
> > connectors, e.g. Google PubSub, RabbitMQ, Cassandra, Pulsar, Hybrid
> > Source, etc. Moreover, PyFlink already supports Table API connectors,
> > and if we take this route, maybe we could just tell users to use the
> > Table API connectors directly.
> >
> > Another option I had in mind is to provide an API which allows
> > configuring the behavior via key/value pairs in both the Java & Python
> > DataStream API connectors (see the sketch at the end of this thread).
> >
> > Regards,
> > Dian
> >
> > On Wed, Jul 5, 2023 at 6:34 PM Chesnay Schepler wrote:
> > >
> > > Have there ever been thoughts about changing flink-python's connector
> > > setup to use the Table API connectors underneath?
> > >
> > > The wrapping of connectors is a bit of a maintenance nightmare and
> > > doesn't really work with external/custom connectors.
> > >
> > > On 04/07/2023 13:35, Dian Fu wrote:
> > > > Thanks Ran Tao for proposing this discussion and Martijn for sharing
> > > > the thought.
> > > >
> > > > While flink-python now fails the CI, it shouldn't actually depend on
> > > > the externalized connectors. I'm not sure what PyFlink does with it,
> > > > but if it belongs to the connector code,
> > > >
> > > > For each DataStream connector, there is a corresponding Python
> > > > wrapper and also some test cases in PyFlink. In theory, we should move
> > > > that wrapper into each connector repository. We did not do that when
> > > > externalizing the connectors because it would add a release burden: it
> > > > means we would have to publish each connector to PyPI separately.
> > > >
> > > > To resolve this problem, I guess we can move the connector support in
> > > > PyFlink into the external connector repository.
> > > >
> > > > Regards,
> > > > Dian
> > > >
> > > >
> > > > On Mon, Jul 3, 2023 at 11:08 PM Ran Tao wrote:
> > > >> @Martijn
> > > >> thanks for the clear explanations.
> > > >>
> > > >> If we follow the line you specified (connectors shouldn't rely on
> > > >> dependencies that may or may not be available in Flink itself), it
> > > >> seems that we should add each dependency we need (such as commons-io
> > > >> and commons-collections) to the connector pom explicitly, and bundle
> > > >> it in the sql-connector uber jar.
> > > >>
> > > >> Then there is only one thing left: we need to make the flink-python
> > > >> tests not depend on the released connectors. Maybe we should check it
> > > >> out and decouple it as you suggested.
> > > >>
> > > >> Best Regards,
> > > >> Ran Tao
> > > >> https://github.com/chucheng92
> > > >>
> > > >>
> > > >> On Mon, Jul 3, 2023 at 22:06, Martijn Visser wrote:
> > > >>
> > > >>> Hi Ran Tao,
> > > >>>
> > > >>> Thanks for opening this topic. I think there's a couple of things
> > > >>> at hand:
> > > >>> 1. Connectors shouldn't rely on dependencies that may or may not be
> > > >>> available in Flink itself, like we've seen with flink-shaded. That
> > > >>> avoids a tight coupling between Flink and connectors, which is
> > > >>> exactly what we try to avoid.
> > > >>> 2. When following that line, that would also be applicable for
> > > >>> things like commons-collections and commons-io. If a connector wants
> > > >>> to use them, it should make sure that it bundles those artifacts
> > > >>> itself.
> > > >>> 3. While flink-python now fails the CI, it shouldn't actually depend
> > > >>> on the externalized connectors. I'm not sure what PyFlink does with
> > > >>> it, but if it belongs to the connector code, that code should also
> > > >>> be moved to the individual connector repo. If it's
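
To make the key/value option mentioned by Dian above concrete, a purely hypothetical sketch (the factory named in the comment does not exist in Flink; the option keys are borrowed from the Table API Kafka connector for illustration):

{code:java}
import java.util.HashMap;
import java.util.Map;

// Hypothetical: configure DataStream connectors via string properties
// (mirroring the Table API's WITH options), so a Python wrapper only has to
// pass a dict through instead of mirroring every builder method.
public class KeyValueConnectorSketch {
    public static void main(String[] args) {
        Map<String, String> options = new HashMap<>();
        options.put("connector", "kafka");
        options.put("topic", "orders");
        options.put("properties.bootstrap.servers", "broker:9092");
        // Hypothetical factory -- does not exist in Flink today:
        // DataStreamConnectors.createSource(options);
        System.out.println(options);
    }
}
{code}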

FLINK-20767 - Support for nested fields filter push down

2023-07-28 Thread Venkatakrishnan Sowrirajan
Hi all,

Currently, I am working on adding support for nested fields filter push
down. In our use case running Flink in batch mode, we found that nested
fields filter push down is key: without it, queries are significantly
slower. Note: Spark SQL supports nested fields filter push down.

While debugging the code using IcebergTableSource as the table source, I
narrowed down the issue to missing support in
RexNodeExtractor#RexNodeToExpressionConverter#visitFieldAccess.
As part of fixing it, I made changes to return an
Option(FieldReferenceExpression) with an appropriate reference to the
parent index and the child index for the nested field, along with the data
type info.

But this new ResolvedExpression cannot be converted back to a RexNode,
which happens in PushFilterIntoSourceScanRuleBase.

A few questions:

1. Does FieldReferenceExpression currently support nested fields, or should
it be extended to support them? (See the sketch after this list.) I
couldn't figure this out from PushProjectIntoTableScanRule, which supports
nested column projection push down.
2. ExpressionConverter converts a ResolvedExpression into a RexNode, but
the new FieldReferenceExpression with the nested field cannot be converted.
This is why the answer to the 1st question is key.
3. Is there anything else that I'm missing here, or is there an even easier
way to add support for nested fields filter push down?
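
To make question 1 concrete, here is a minimal sketch of the current constructor next to the hypothetical nested-aware variant (the extended signature is my assumption, not existing API):

{code:java}
import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.expressions.FieldReferenceExpression;
import org.apache.flink.table.types.DataType;

public class NestedFieldRefSketch {
    public static void main(String[] args) {
        // Today's constructor addresses a top-level field only:
        // (name, dataType, inputIndex, fieldIndex) -- there is no slot for a
        // path into a nested row.
        DataType userType = DataTypes.ROW(DataTypes.FIELD("age", DataTypes.INT()));
        FieldReferenceExpression topLevel =
                new FieldReferenceExpression("user", userType, 0, 2);
        System.out.println(topLevel.asSummaryString());

        // A nested-aware variant would also need the child path, e.g. a
        // hypothetical (name, dataType, inputIndex, fieldIndex, nestedFieldIndex)
        // overload -- that signature does not exist in Flink today.
    }
}
{code}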

Partially working changes - Commit.
Please feel free to leave a comment directly in the commit.

Any pointers here would be much appreciated! Thanks in advance.

Disclaimer: I'm relatively new to the Flink code base, especially the Table planner :-).

Regards
Venkata krishnan