[jira] [Created] (FLINK-25834) 'flink run' command can not use 'pipeline.classpaths' in flink-conf.yaml
Ada Wong created FLINK-25834:

Summary: 'flink run' command can not use 'pipeline.classpaths' in flink-conf.yaml
Key: FLINK-25834
URL: https://issues.apache.org/jira/browse/FLINK-25834
Project: Flink
Issue Type: Bug
Components: Client / Job Submission, Command Line Client
Affects Versions: 1.14.3
Reporter: Ada Wong

When we submit a job with 'flink run' or the CliFrontend class and do not pass -C/--classpath, the 'pipeline.classpaths' setting in flink-conf.yaml is silently ignored.

Example flink-conf.yaml content:
{code:java}
pipeline.classpaths: file:///opt/flink-1.14.2/other/flink-sql-connector-elasticsearch7_2.12-1.14.2.jar{code}
Submit command:
{code:java}
bin/flink run /flink14-sql/target/flink14-sql-1.0-SNAPSHOT-jar-with-dependencies.jar{code}
This throws an Elasticsearch 7 "class not found" exception. There are two reasons for this:
# ProgramOptions#applyToConfiguration overwrites the 'pipeline.classpaths' value in the configuration with an empty (size zero) list.
# ProgramOptions#buildProgram does not set 'pipeline.classpaths' on the PackagedProgram.

To solve problem 1, could we return early in ConfigUtils#encodeCollectionToConfig() when the list size is zero?
To solve problem 2, could we append the 'pipeline.classpaths' values to the classpaths and pass them together via setUserClassPaths?

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
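The early-return guard proposed for reason 1 could look roughly like this. This is a standalone sketch of the idea only: a plain Map stands in for Flink's Configuration, and the method signature is simplified from the real ConfigUtils#encodeCollectionToConfig.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class ConfigUtilsSketch {
    // Simplified stand-in for ConfigUtils#encodeCollectionToConfig: encode a
    // collection into the config under the given key, but return early when
    // the collection is empty so an existing value (e.g. pipeline.classpaths
    // loaded from flink-conf.yaml) is not overwritten by an empty list.
    public static <T> void encodeCollectionToConfig(
            Map<String, List<String>> config,
            String key,
            Collection<T> values,
            Function<T, String> encoder) {
        if (values == null || values.isEmpty()) {
            return; // proposed fix: keep whatever is already in the config
        }
        List<String> encoded = new ArrayList<>();
        for (T value : values) {
            encoded.add(encoder.apply(value));
        }
        config.put(key, encoded);
    }
}
```

With this guard, a command line that passes no -C/--classpath entries no longer clobbers the value read from flink-conf.yaml, while an explicit non-empty list still wins.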
[jira] [Created] (FLINK-25804) Loading and running connector code use separated ClassLoader.
Ada Wong created FLINK-25804:

Summary: Loading and running connector code use separated ClassLoader.
Key: FLINK-25804
URL: https://issues.apache.org/jira/browse/FLINK-25804
Project: Flink
Issue Type: New Feature
Components: API / Core, Connectors / Common, Table SQL / Runtime
Affects Versions: 1.14.3
Reporter: Ada Wong

Using multiple connectors in one job can cause class conflicts, and these conflicts cannot be solved by shading. The following is example code.
{code:java}
CREATE TABLE es6 (
  user_id STRING,
  user_name STRING,
  PRIMARY KEY (user_id) NOT ENFORCED
) WITH (
  'connector' = 'elasticsearch-6',
  'hosts' = 'http://localhost:9200',
  'index' = 'users',
  'document-type' = 'foo'
);

CREATE TABLE es7 (
  user_id STRING,
  user_name STRING,
  PRIMARY KEY (user_id) NOT ENFORCED
) WITH (
  'connector' = 'elasticsearch-7',
  'hosts' = 'http://localhost:9200',
  'index' = 'users'
);

CREATE TABLE ods (
  user_id STRING,
  user_name STRING
) WITH (
  'connector' = 'datagen'
);

INSERT INTO es6 SELECT user_id, user_name FROM ods;
INSERT INTO es7 SELECT user_id, user_name FROM ods;{code}
Inspired by the PluginManager, PluginFileSystemFactory and ClassLoaderFixingFileSystem classes, could we create ClassLoaderFixing* wrapper classes to avoid class conflicts, such as ClassLoaderFixingDynamicTableFactory, ClassLoaderFixingSink or ClassLoaderFixingSinkFunction?
If we use ClassLoader fixing, every call to SinkFunction#invoke will switch the classloader via Thread#currentThread()#setContextClassLoader(). Does setContextClassLoader() have a heavy overhead?
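The wrapper idea above could be sketched like this. All names here are hypothetical (this is not Flink's SinkFunction interface, which also declares checked exceptions); the sketch only shows the switch-and-restore pattern around the delegate call.

```java
public class ClassLoaderFixingSketch {
    // Simplified stand-in for a sink function (no checked exceptions).
    public interface SinkFunction<T> {
        void invoke(T value);
    }

    // Hypothetical ClassLoaderFixing* wrapper: every call runs with the
    // connector's own classloader as the thread context classloader, and the
    // previous loader is always restored afterwards.
    public static class ClassLoaderFixingSinkFunction<T> implements SinkFunction<T> {
        private final SinkFunction<T> delegate;
        private final ClassLoader pluginClassLoader;

        public ClassLoaderFixingSinkFunction(SinkFunction<T> delegate, ClassLoader pluginClassLoader) {
            this.delegate = delegate;
            this.pluginClassLoader = pluginClassLoader;
        }

        @Override
        public void invoke(T value) {
            Thread current = Thread.currentThread();
            ClassLoader previous = current.getContextClassLoader();
            current.setContextClassLoader(pluginClassLoader);
            try {
                delegate.invoke(value); // runs with the connector's classloader
            } finally {
                current.setContextClassLoader(previous); // always restore
            }
        }
    }
}
```

On the overhead question: setContextClassLoader() itself is just a field write (plus a possible security check), so the per-record cost is dominated by the two calls and the try/finally, not the setter.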
[jira] [Created] (FLINK-25795) Support Pulsar sink connector in Python DataStream API.
Ada Wong created FLINK-25795:

Summary: Support Pulsar sink connector in Python DataStream API.
Key: FLINK-25795
URL: https://issues.apache.org/jira/browse/FLINK-25795
Project: Flink
Issue Type: New Feature
Components: API / Python, Connectors / Pulsar
Affects Versions: 1.14.3
Reporter: Ada Wong

[https://issues.apache.org/jira/secure/ViewProfile.jspa?name=knaufk]
[jira] [Created] (FLINK-25601) Update 'state.backend' in flink-conf.yaml
Ada Wong created FLINK-25601:

Summary: Update 'state.backend' in flink-conf.yaml
Key: FLINK-25601
URL: https://issues.apache.org/jira/browse/FLINK-25601
Project: Flink
Issue Type: Improvement
Components: Runtime / State Backends
Affects Versions: 1.14.2
Reporter: Ada Wong

The value and comments for 'state.backend' in flink-conf.yaml are deprecated:
{code:java}
# Supported backends are 'jobmanager', 'filesystem', 'rocksdb', or the
# <class-name-of-factory>.
#
# state.backend: filesystem{code}
We should update them to the following:
{code:java}
# Supported backends are 'hashmap', 'rocksdb', or the
# <class-name-of-factory>.
#
# state.backend: hashmap{code}
[jira] [Created] (FLINK-25530) Support Pulsar source connector in Python DataStream API.
Ada Wong created FLINK-25530:

Summary: Support Pulsar source connector in Python DataStream API.
Key: FLINK-25530
URL: https://issues.apache.org/jira/browse/FLINK-25530
Project: Flink
Issue Type: New Feature
Components: API / Python
Affects Versions: 1.14.2
Reporter: Ada Wong

Flink has supported the Pulsar source connector since https://issues.apache.org/jira/browse/FLINK-20726.
[jira] [Created] (FLINK-25485) JDBC connector implicitly add options when use mysql
Ada Wong created FLINK-25485:

Summary: JDBC connector implicitly add options when use mysql
Key: FLINK-25485
URL: https://issues.apache.org/jira/browse/FLINK-25485
Project: Flink
Issue Type: Improvement
Components: Connectors / JDBC
Affects Versions: 1.14.2
Reporter: Ada Wong

When we use the MySQL sink directly, the buffer-flush options (sink.buffer-flush.max-rows, sink.buffer-flush.interval) do not increase throughput on their own. We must set 'rewriteBatchedStatements=true' in the MySQL JDBC URL for 'sink.buffer-flush' to take effect. I want to add an appendUrlSuffix method to MySQLDialect that sets this JDBC option automatically, because many users forget it or don't know about it. Inspired by Alibaba DataX: https://github.com/alibaba/DataX/blob/master/plugin-rdbms-util/src/main/java/com/alibaba/datax/plugin/rdbms/util/DataBaseType.java
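The proposed appendUrlSuffix could look roughly like this. This is a hypothetical standalone sketch, not the actual MySQLDialect code; it appends the property only when the user has not set it, and picks '?' or '&' depending on whether the URL already carries parameters.

```java
public class MySqlUrlSketch {
    // Hypothetical sketch of MySQLDialect#appendUrlSuffix: ensure
    // rewriteBatchedStatements=true is present in the JDBC URL so that
    // Connector/J actually batches the buffered statements.
    public static String appendUrlSuffix(String jdbcUrl) {
        if (jdbcUrl.contains("rewriteBatchedStatements=")) {
            return jdbcUrl; // respect an explicit user choice
        }
        String separator = jdbcUrl.contains("?") ? "&" : "?";
        return jdbcUrl + separator + "rewriteBatchedStatements=true";
    }
}
```

For example, "jdbc:mysql://host:3306/db" would become "jdbc:mysql://host:3306/db?rewriteBatchedStatements=true", while a URL that already sets the property is left untouched.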
[jira] [Created] (FLINK-25434) Throw an error when BigDecimal precision overflows.
Ada Wong created FLINK-25434:

Summary: Throw an error when BigDecimal precision overflows.
Key: FLINK-25434
URL: https://issues.apache.org/jira/browse/FLINK-25434
Project: Flink
Issue Type: Bug
Components: Table SQL / Planner, Table SQL / Runtime
Affects Versions: 1.14.2
Reporter: Ada Wong

A lot of data was lost but no error was thrown. As the comment below states, if the precision overflows, null is returned.
{code:java}
/** If the precision overflows, null will be returned. */
public static @Nullable DecimalData fromBigDecimal(BigDecimal bd, int precision, int scale) {
    bd = bd.setScale(scale, RoundingMode.HALF_UP);
    if (bd.precision() > precision) {
        return null;
    }
    long longVal = -1;
    if (precision <= MAX_COMPACT_PRECISION) {
        longVal = bd.movePointRight(scale).longValueExact();
    }
    return new DecimalData(precision, scale, longVal, bd);
}
{code}
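A strict variant, as the ticket title suggests, could raise an exception at the same check instead of returning null. This is a hypothetical sketch: the DecimalData wrapper and the compact-long encoding are elided, and only the overflow check changes.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalCheckSketch {
    // Hypothetical strict counterpart of fromBigDecimal: the rounding step is
    // the same, but a precision overflow throws instead of silently producing
    // null, so the data loss becomes visible to the user.
    public static BigDecimal fromBigDecimalStrict(BigDecimal bd, int precision, int scale) {
        bd = bd.setScale(scale, RoundingMode.HALF_UP);
        if (bd.precision() > precision) {
            throw new ArithmeticException(
                    "Value " + bd + " does not fit DECIMAL(" + precision + ", " + scale + ")");
        }
        return bd;
    }
}
```

Whether to throw or to keep null-and-drop is a planner-level decision; the sketch only shows where the error would surface.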
[jira] [Created] (FLINK-25240) Update log4j2 version to 2.15.0
Ada Wong created FLINK-25240:

Summary: Update log4j2 version to 2.15.0
Key: FLINK-25240
URL: https://issues.apache.org/jira/browse/FLINK-25240
Project: Flink
Issue Type: Bug
Components: API / Core
Affects Versions: 1.14.0
Reporter: Ada Wong

Apache log4j2 versions from 2.0 up to and including 2.14.1 have an RCE zero-day vulnerability.
https://www.cyberkendra.com/2021/12/worst-log4j-rce-zeroday-dropped-on.html
[jira] [Created] (FLINK-25239) Delete useless variables
Ada Wong created FLINK-25239:

Summary: Delete useless variables
Key: FLINK-25239
URL: https://issues.apache.org/jira/browse/FLINK-25239
Project: Flink
Issue Type: Improvement
Components: Connectors / JDBC
Affects Versions: 1.14.0
Reporter: Ada Wong

These constants are unused and can be deleted:
{code:java}
public static final int DEFAULT_FLUSH_MAX_SIZE = 5000;
public static final long DEFAULT_FLUSH_INTERVAL_MILLS = 0L;
{code}
[jira] [Created] (FLINK-25188) Cannot install PyFlink in M1 CPU
Ada Wong created FLINK-25188:

Summary: Cannot install PyFlink in M1 CPU
Key: FLINK-25188
URL: https://issues.apache.org/jira/browse/FLINK-25188
Project: Flink
Issue Type: Bug
Components: API / Python
Affects Versions: 1.14.0
Reporter: Ada Wong

ERROR: Could not find a version that satisfies the requirement pandas<1.2.0,>=1.0 (from apache-flink)
[jira] [Created] (FLINK-25141) Elasticsearch connector customize sink parallelism
Ada Wong created FLINK-25141:

Summary: Elasticsearch connector customize sink parallelism
Key: FLINK-25141
URL: https://issues.apache.org/jira/browse/FLINK-25141
Project: Flink
Issue Type: Improvement
Components: Connectors / ElasticSearch
Affects Versions: 1.14.0
Reporter: Ada Wong

Inspired by the JDBC and Kafka connectors, add a 'sink.parallelism' option and use SinkProvider#of(sink, sinkParallelism).
[jira] [Created] (FLINK-24661) ConfigOption add isSecret method to judge sensitive options
Ada Wong created FLINK-24661:

Summary: ConfigOption add isSecret method to judge sensitive options
Key: FLINK-24661
URL: https://issues.apache.org/jira/browse/FLINK-24661
Project: Flink
Issue Type: Improvement
Components: API / Core
Affects Versions: 1.13.3
Reporter: Ada Wong

Related ticket: https://issues.apache.org/jira/browse/FLINK-24381
[~chesnay]
[jira] [Created] (FLINK-24381) Hidden password value when Flink SQL connector throw exception.
Ada Wong created FLINK-24381:

Summary: Hidden password value when Flink SQL connector throw exception.
Key: FLINK-24381
URL: https://issues.apache.org/jira/browse/FLINK-24381
Project: Flink
Issue Type: Improvement
Components: Table SQL / Planner
Affects Versions: 1.13.2
Reporter: Ada Wong

The following is the error message. The password is 'bar' and is displayed in plain text. Could we hide it as password='***' or password='******', inspired by the Apache Kafka source code?
{code:java}
Missing required options are: hosts
Unable to create a sink for writing table 'default_catalog.default_database.dws'.
Table options are:
'connector'='elasticsearch7-x'
'index'='foo'
'password'='bar'
    at org.apache.flink.table.factories.FactoryUtil.createTableSink(FactoryUtil.java:208)
    at org.apache.flink.table.planner.delegation.PlannerBase.getTableSink(PlannerBase.scala:369)
    at org.apache.flink.table.planner.delegation.PlannerBase.translateToRel(PlannerBase.scala:221)
    at org.apache.flink.table.planner.delegation.PlannerBase.$anonfun$translate$1(PlannerBase.scala:159
{code}
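The masking step could be sketched like this. This is a hypothetical standalone example, not Flink's FactoryUtil code; the SENSITIVE_KEYS set and the maskSensitive name are illustrative, mirroring how Kafka redacts password-type configs when printing them.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class OptionMaskSketch {
    // Illustrative list of substrings that mark an option as sensitive.
    private static final Set<String> SENSITIVE_KEYS = Set.of("password", "secret", "token");

    // Return a copy of the table options with sensitive values replaced by
    // '******' so they can be safely embedded in an exception message.
    public static Map<String, String> maskSensitive(Map<String, String> options) {
        Map<String, String> masked = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : options.entrySet()) {
            boolean sensitive = SENSITIVE_KEYS.stream()
                    .anyMatch(k -> e.getKey().toLowerCase(Locale.ROOT).contains(k));
            masked.put(e.getKey(), sensitive ? "******" : e.getValue());
        }
        return masked;
    }
}
```

Applied to the options above, the error message would print 'password'='******' while leaving 'connector' and 'index' readable.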
[jira] [Created] (FLINK-23324) Postgres of JDBC Connector enable case-sensitive.
Ada Wong created FLINK-23324:

Summary: Postgres of JDBC Connector enable case-sensitive.
Key: FLINK-23324
URL: https://issues.apache.org/jira/browse/FLINK-23324
Project: Flink
Issue Type: Bug
Components: Connectors / JDBC
Affects Versions: 1.12.4, 1.13.1
Reporter: Ada Wong

Currently the PostgresDialect is case-insensitive. I think this is a bug.
https://stackoverflow.com/questions/20878932/are-postgresql-column-names-case-sensitive
https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS
Could we delete PostgresDialect#quoteIdentifier so that the super class implementation is used?
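The difference can be sketched as follows. Method names are illustrative, and the "current" behavior shown here is my reading of the override the ticket proposes to delete: PostgreSQL folds unquoted identifiers to lower case, so returning the identifier unquoted makes lookups effectively case-insensitive, while double-quoting (the base-dialect behavior) preserves the case the user wrote.

```java
public class QuoteIdentifierSketch {
    // Behavior being questioned: leave the identifier unquoted, so
    // PostgreSQL folds it to lower case when resolving it.
    public static String quoteIdentifierCurrent(String identifier) {
        return identifier;
    }

    // Super-class behavior the ticket proposes to fall back to: wrap the
    // identifier in double quotes, which preserves its case in PostgreSQL.
    public static String quoteIdentifierProposed(String identifier) {
        return "\"" + identifier + "\"";
    }
}
```

With the proposed behavior, a column created as "UserName" would be addressed as "UserName" rather than silently resolving to username.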
[jira] [Created] (FLINK-23263) LocalBufferPool can not request memory.
Ada Wong created FLINK-23263:

Summary: LocalBufferPool can not request memory.
Key: FLINK-23263
URL: https://issues.apache.org/jira/browse/FLINK-23263
Project: Flink
Issue Type: Improvement
Components: Runtime / Network
Affects Versions: 1.10.1
Reporter: Ada Wong

The Flink job is running, but it cannot consume Kafka data. The following is the thread dump (truncated):
{code:java}
"Map -> to: Tuple2 -> Map -> (from: (id, sid, item, val, unit, dt, after_index, tablename, PROCTIME) -> where: (AND(=(tablename, CONCAT(_UTF-16LE't_real', currtime2(dt, _UTF-16LE'MMdd'))), OR(=(after_index, _UTF-16LE'2.6'), AND(=(sid, _UTF-16LE'7296'), =(after_index, _UTF-16LE'2.10'))), LIKE(item, _UTF-16LE'??5%'), =(MOD(getminutes(dt), 5), 0))), select: (id AS ID, sid AS STID, item AS ITEM, val AS VAL, unit AS UNIT, dt AS DATATIME), from: (id, sid, item, val, unit, dt, after_index, tablename, PROCTIME) -> where: (AND(=(tablename, CONCAT(_UTF-16LE't_real', currtime2(dt, _UTF-16LE'MMdd'))), OR(=(after_index, _UTF-16LE'2.6'), AND(=(sid, _UTF-16LE'7296'), =(after_index, _UTF-16LE'2.10'))), LIKE(item, _UTF-16LE'??5%'), =(MOD(getminutes(dt), 5), 0))), select: (sid AS STID, val AS VAL, dt AS DATATIME)) (1/1)" #79 prio=5 os_prio=0 tid=0x7fd4c4a94000 nid=0x21c0 in Object.wait() [0x7fd4d5719000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestMemorySegment(LocalBufferPool.java:251)
    - locked <0x00074e6c8b98> (a java.util.ArrayDeque)
    at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBuilderBlocking(LocalBufferPool.java:218)
    at org.apache.flink.runtime.io.network.api.writer.RecordWriter.requestNewBufferBuilder(RecordWriter.java:264)
    at org.apache.flink.runtime.io.network.api.writer.RecordWriter.copyFromSerializerToTargetChannel(RecordWriter.java:192)
    at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:162)
    at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:128)
    at org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:107)
    at org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:89)
    at org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:45)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:718)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:696)
    at org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:51)
    at org.apache.flink.table.runtime.CRowWrappingCollector.collect(CRowWrappingCollector.scala:37)
    at org.apache.flink.table.runtime.CRowWrappingCollector.collect(CRowWrappingCollector.scala:28)
    at DataStreamCalcRule$4160.processElement(Unknown Source)
    at org.apache.flink.table.runtime.CRowProcessRunner.processElement(CRowProcessRunner.scala:66)
    at org.apache.flink.table.runtime.CRowProcessRunner.processElement(CRowProcessRunner.scala:35)
    at org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66)
    at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.pushToOperator(OperatorChain.java:488)
    at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.collect(OperatorChain.java:465)
    at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.collect(OperatorChain.java:425)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:718)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:696)
    at org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:51)
    at org.apache.flink.table.runtime.CRowWrappingCollector.collect(CRowWrappingCollector.scala:37)
    at org.apache.flink.table.runtime.CRowWrappingCollector.collect(CRowWrappingCollector.scala:28)
    at DataStreamSourceConversion$4104.processElement(Unknown Source)
    at org.apache.flink.table.runtime.CRowOutputProcessRunner.processElement(CRowOutputProcessRunner.scala:70)
    at org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66)
    at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.pushToOperator(OperatorChain.java:488)
    at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.collect(OperatorChain.java:465)
    at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.collect(OperatorChain.java:425)
    at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingBroadcastingOutputCollector.collect(OperatorChain.java:686)
    at ...
{code}
[jira] [Created] (FLINK-23074) There is a class conflict between flink-connector-hive and flink-parquet
Ada Wong created FLINK-23074:

Summary: There is a class conflict between flink-connector-hive and flink-parquet
Key: FLINK-23074
URL: https://issues.apache.org/jira/browse/FLINK-23074
Project: Flink
Issue Type: Improvement
Components: Connectors / Hive
Reporter: Ada Wong

flink-connector-hive 2.3.6 includes parquet-hadoop version 1.8.1, but flink-parquet includes 1.11.1. org.apache.parquet.hadoop.example.GroupWriteSupport is different between the two versions.
[jira] [Created] (FLINK-22776) Delete casting to byte[]
Ada Wong created FLINK-22776:

Summary: Delete casting to byte[]
Key: FLINK-22776
URL: https://issues.apache.org/jira/browse/FLINK-22776
Project: Flink
Issue Type: Improvement
Components: Connectors / JDBC
Affects Versions: 1.13.0
Attachments: image-2021-05-26-10-38-04-578.png
Reporter: Ada Wong

Casting to 'byte[]' is redundant. Could we delete it?
!image-2021-05-26-10-38-04-578.png|thumbnail!