[jira] [Created] (FLINK-25684) Support enhanced show databases syntax

2022-01-17 Thread Moses (Jira)
Moses created FLINK-25684:
-

 Summary: Support enhanced show databases syntax
 Key: FLINK-25684
 URL: https://issues.apache.org/jira/browse/FLINK-25684
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Reporter: Moses


An enhanced `show databases` statement such as ` show databases like 'db%' ` is 
widely supported in popular SQL engines such as Spark SQL and MySQL.

We could use such a statement to easily list the databases we want.
h3. SHOW DATABASES [ LIKE regex_pattern ]
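For illustration, the statement might be used like below once supported (the exact grammar is an assumption until the feature is implemented):
{code:sql}
-- Hypothetical usage of the proposed syntax:
SHOW DATABASES LIKE 'db%';
{code}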



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25435) Can not read jobmanager/taskmanager.log in yarn-per-job mode.

2021-12-23 Thread Moses (Jira)
Moses created FLINK-25435:
-

 Summary: Can not read jobmanager/taskmanager.log in yarn-per-job 
mode.
 Key: FLINK-25435
 URL: https://issues.apache.org/jira/browse/FLINK-25435
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Client
Reporter: Moses


I'm using the SQL Client to submit a job, and using the `SET` statement to specify 
the deployment mode.
{code:sql}
SET execution.target=yarn-per-job;
...
{code}
But I cannot find the log files on either the master or the taskmanagers.

I found that `GenericCLI` and `FlinkYarnSessionCli` set 
`$internal.deployment.config-dir={configurationDirectory}` in their execution 
configurations.

Should we set this configuration in `DefaultCLI` as well?





[jira] [Created] (FLINK-25379) Support limit push down in DATAGEN connector.

2021-12-19 Thread Moses (Jira)
Moses created FLINK-25379:
-

 Summary: Support limit push down in DATAGEN connector.
 Key: FLINK-25379
 URL: https://issues.apache.org/jira/browse/FLINK-25379
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Reporter: Moses


I used datagen to generate data, and I found that the source is never-ending, even with a `LIMIT` clause:
{code:sql}
SET sql-client.execution.result-mode=TABLEAU;
CREATE TABLE datagen (a STRING) WITH ('connector'='datagen');
SELECT * FROM datagen LIMIT 10;
{code}

I think it is advisable to push the limit down into the datagen source.
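As a side note, a bounded run can already be approximated with the existing `number-of-rows` option, though that is a table-definition workaround rather than real limit push-down:
{code:sql}
-- Workaround with the existing option (not limit push-down):
CREATE TABLE datagen_bounded (a STRING) WITH (
    'connector' = 'datagen',
    'number-of-rows' = '10'
);
SELECT * FROM datagen_bounded;
{code}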






[jira] [Created] (FLINK-24882) Make SQL format identifier NOT case-sensitive.

2021-11-11 Thread Moses (Jira)
Moses created FLINK-24882:
-

 Summary: Make SQL format identifier NOT case-sensitive.
 Key: FLINK-24882
 URL: https://issues.apache.org/jira/browse/FLINK-24882
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Client
Affects Versions: 1.13.2
Reporter: Moses


If I use an upper-case format identifier like below:
{code:java}
CREATE TABLE t_table_1 (a INT) WITH (..., 'format' = 'JSON', ...)
{code}
Then I got a `Could not find factory ...` exception:
{panel}
[ERROR] Could not execute SQL statement. Reason:
org.apache.flink.table.api.ValidationException: Could not find any factory for 
identifier 'JSON' that implements 
'org.apache.flink.table.factories.DeserializationFormatFactory' in the 
classpath.

Available factory identifiers are:

avro
canal-json
csv
debezium-json
json
maxwell-json
raw
{panel}
This can be fixed by comparing the identifiers ignoring case.
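For now, the lower-case identifier works as a workaround (the `filesystem` connector and path below are only illustrative assumptions):
{code:sql}
-- Workaround: use the lower-case identifier that matches the registered factory
CREATE TABLE t_table_1 (a INT) WITH (
    'connector' = 'filesystem',
    'path' = '/tmp/t1',
    'format' = 'json'
);
{code}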






[jira] [Created] (FLINK-24052) Flink SQL reads S3 bucket data.

2021-08-30 Thread Moses (Jira)
Moses created FLINK-24052:
-

 Summary: Flink SQL reads S3 bucket data.
 Key: FLINK-24052
 URL: https://issues.apache.org/jira/browse/FLINK-24052
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Ecosystem
Reporter: Moses


I want to use Flink SQL to read S3 bucket data. But I found it ONLY supports 
absolute paths, which means I cannot read all content in the bucket. 
My SQL statements are written as below:
{code:sql}
CREATE TABLE file_data (
    a BIGINT, b STRING, c STRING, d DOUBLE, e BOOLEAN, f DATE, g STRING,
    h STRING, i STRING, j STRING, k STRING, l STRING, m STRING, n STRING,
    o STRING, p FLOAT
) WITH (
    'connector' = 'filesystem',
    'path' = 's3a://my-bucket',
    'format' = 'parquet'
);

SELECT COUNT(*) FROM file_data;
{code}
The exception info:


{code:java}
Caused by: java.lang.IllegalArgumentException: path must be absolute
	at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) ~[flink-s3-fs-hadoop-1.13.1.jar:1.13.1]
	at org.apache.hadoop.fs.s3a.s3guard.PathMetadata.<init>(PathMetadata.java:68) ~[flink-s3-fs-hadoop-1.13.1.jar:1.13.1]
	at org.apache.hadoop.fs.s3a.s3guard.PathMetadata.<init>(PathMetadata.java:60) ~[flink-s3-fs-hadoop-1.13.1.jar:1.13.1]
	at org.apache.hadoop.fs.s3a.s3guard.PathMetadata.<init>(PathMetadata.java:56) ~[flink-s3-fs-hadoop-1.13.1.jar:1.13.1]
	at org.apache.hadoop.fs.s3a.s3guard.S3Guard.putAndReturn(S3Guard.java:149) ~[flink-s3-fs-hadoop-1.13.1.jar:1.13.1]
{code}

Is there any solution to meet my requirement?
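One possible workaround (an untested assumption on my side) is to point `path` at a key prefix under the bucket instead of the bare bucket root, e.g.:
{code:sql}
-- Hypothetical workaround: read from a prefix rather than the bucket root
CREATE TABLE file_data_prefixed (a BIGINT, b STRING) WITH (
    'connector' = 'filesystem',
    'path' = 's3a://my-bucket/data/',
    'format' = 'parquet'
);
{code}
But this still would not cover reading everything directly under the bucket root.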








[jira] [Created] (FLINK-22685) Write data to hive table in batch mode throws FileNotFoundException.

2021-05-17 Thread Moses (Jira)
Moses created FLINK-22685:
-

 Summary: Write data to hive table in batch mode throws 
FileNotFoundException.
 Key: FLINK-22685
 URL: https://issues.apache.org/jira/browse/FLINK-22685
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive
 Environment: Flink 1.11.1.
Reporter: Moses


h3. Scenario
I want to launch a batch job to process Hive table data and write the result to 
another table (*T1*). My SQL statement is written like below:
{code:sql}
-- use hive dialect
SET table.sql-dialect=hive;
-- insert into hive table
insert overwrite table T1
partition (p_day_id,p_file_id)
select distinct 
{code}
The job was successfully launched, but it failed on the *Sink* operator. On the Flink UI 
page I saw that all task states were `*FINISHED*`, but *the job failed and restarted 
again*.
I found exception information like below (*the path was masked*):
{code:java}
java.lang.Exception: Failed to finalize execution on master
	at org.apache.flink.runtime.executiongraph.ExecutionGraph.vertexFinished(ExecutionGraph.java:1291)
	at org.apache.flink.runtime.executiongraph.ExecutionVertex.executionFinished(ExecutionVertex.java:870)
	at org.apache.flink.runtime.executiongraph.Execution.markFinished(Execution.java:1125)
	at org.apache.flink.runtime.executiongraph.ExecutionGraph.updateStateInternal(ExecutionGraph.java:1491)
	at org.apache.flink.runtime.executiongraph.ExecutionGraph.updateState(ExecutionGraph.java:1464)
	at org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:497)
	at org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:386)
	at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:284)
	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:199)
	at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
	at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
	at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
	at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
	at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
	at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
	at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
	at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
	at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
	at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
	at akka.actor.ActorCell.invoke(ActorCell.scala:561)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
	at akka.dispatch.Mailbox.run(Mailbox.scala:225)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
	at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.flink.table.api.TableException: Exception in finalizeGlobal
	at org.apache.flink.table.filesystem.FileSystemOutputFormat.finalizeGlobal(FileSystemOutputFormat.java:97)
	at org.apache.flink.runtime.jobgraph.InputOutputFormatVertex.finalizeOnMaster(InputOutputFormatVertex.java:132)
	at org.apache.flink.runtime.executiongraph.ExecutionGraph.vertexFinished(ExecutionGraph.java:1286)
	... 31 more
Caused by: java.io.FileNotFoundException: File //XX/XXX/.staging_1621244168369 does not exist.
	at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:814)
	at org.apache.hadoop.hdfs.DistributedFileSystem.access$700(DistributedFileSystem.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:872)
	at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:868)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:868)
	at org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFil