[jira] [Commented] (FLINK-19005) used metaspace grow on every execution
[ https://issues.apache.org/jira/browse/FLINK-19005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186353#comment-17186353 ] ShenDa commented on FLINK-19005:

[~chesnay] It looks like this problem is hard to solve. Do we have any plan or ideas for fixing it? Or is there another place where we could discuss it?

> used metaspace grow on every execution
> ---------------------------------------
>
>                 Key: FLINK-19005
>                 URL: https://issues.apache.org/jira/browse/FLINK-19005
>             Project: Flink
>          Issue Type: Bug
>          Components: Client / Job Submission, Runtime / Configuration, Runtime / Coordination
>    Affects Versions: 1.11.1
>            Reporter: Guillermo Sánchez
>            Assignee: Chesnay Schepler
>            Priority: Major
>         Attachments: heap_dump_after_10_executions.zip, heap_dump_after_1_execution.zip, heap_dump_echo_lee.tar.xz, modified-jdbc-inputformat.png, origin-jdbc-inputformat.png
>
> Hi!
> I'm running a Flink 1.11.1 cluster where I execute batch jobs built with the DataSet API.
> I submit these jobs every day to calculate daily data.
> On every execution, the cluster's used metaspace increases by 7 MB and is never released.
> This ends up in an OutOfMemoryError caused by Metaspace every 15 days, and I need to restart the cluster to clear the metaspace.
> taskmanager.memory.jvm-metaspace.size is set to 512mb.
> Any idea what could be causing this metaspace growth and why it is not released?
>
> === Summary ===
>
> Case 1, reported by [~gestevez]:
> * Flink 1.11.1
> * Java 11
> * Maximum Metaspace size set to 512mb
> * Custom Batch job, submitted daily
> * Requires restart every 15 days after an OOM
>
> Case 2, reported by [~Echo Lee]:
> * Flink 1.11.0
> * Java 11
> * G1GC
> * WordCount Batch job, submitted every second / every 5 minutes
> * eventually fails TaskExecutor with OOM
>
> Case 3, reported by [~DaDaShen]:
> * Flink 1.11.0
> * Java 11
> * WordCount Batch job, submitted every 5 seconds
> * growing Metaspace, eventually OOM
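For reference, the Metaspace limit mentioned above is set in the TaskManager's flink-conf.yaml. A minimal sketch, using the value quoted in this report (tune it to your workload):

{code}
# flink-conf.yaml (TaskManager) -- value as quoted in this report
taskmanager.memory.jvm-metaspace.size: 512m
{code}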
[jira] [Comment Edited] (FLINK-19005) used metaspace grow on every execution
[ https://issues.apache.org/jira/browse/FLINK-19005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186221#comment-17186221 ] ShenDa edited comment on FLINK-19005 at 8/28/20, 2:57 AM:

[~chesnay] Thanks for your detailed explanation, but I still think there may be something wrong in Flink. I find that the JdbcInputFormat & JdbcOutputFormat are a key cause of the Metaspace OOM, because java.sql.DriverManager does not release its reference to the Driver. DriverManager is loaded by the JDK's internal class loader, but the driver is loaded by the ChildFirstClassLoader, which means the ChildFirstClassLoader can never be garbage collected, according to my analysis of the dump file. I use the following code to reproduce the issue, with org.postgresql.Driver as the JDBC driver.

{code:java}
public static void main(String[] args) throws Exception {
    EnvironmentSettings envSettings = EnvironmentSettings.newInstance()
            .useBlinkPlanner()
            .inBatchMode()
            .build();
    TableEnvironment tEnv = TableEnvironment.create(envSettings);
    tEnv.executeSql(
            "CREATE TABLE " + INPUT_TABLE + "(" +
                    "id BIGINT," +
                    "timestamp6_col TIMESTAMP(6)," +
                    "timestamp9_col TIMESTAMP(6)," +
                    "time_col TIME," +
                    "real_col FLOAT," +
                    "decimal_col DECIMAL(10, 4)" +
                    ") WITH (" +
                    " 'connector.type'='jdbc'," +
                    " 'connector.url'='" + DB_URL + "'," +
                    " 'connector.table'='" + INPUT_TABLE + "'," +
                    " 'connector.username'='" + USERNAME + "'," +
                    " 'connector.password'='" + PASSWORD + "'" +
                    ")"
    );
    TableResult tableResult = tEnv.executeSql(
            "SELECT timestamp6_col, decimal_col FROM " + INPUT_TABLE);
    tableResult.collect();
}
{code}

The diagram below shows the Metaspace usage growing continuously until the TaskManager eventually goes offline.

!origin-jdbc-inputformat.png!

Additionally, I tried to fix the issue by appending the following code to closeInputFormat(), which finally allows the Metaspace to be garbage collected.

{code:java}
try {
    final Enumeration<Driver> drivers = DriverManager.getDrivers();
    while (drivers.hasMoreElements()) {
        DriverManager.deregisterDriver(drivers.nextElement());
    }
} catch (SQLException se) {
    LOG.info("Inputformat couldn't be closed - " + se.getMessage());
}
{code}

The following diagram shows that the Metaspace usage now decreases.

!modified-jdbc-inputformat.png!

So, do you think this is a Flink problem, and should we create a new issue to fix it?
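One caveat with the deregistration loop above: DriverManager.getDrivers() returns every driver visible to the caller, so blindly deregistering all of them could affect other jobs sharing the TaskManager. A more targeted variant (a sketch only, not what Flink ultimately implemented) would deregister only the drivers that were loaded by the current user-code class loader:

{code:java}
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Enumeration;

public final class JdbcDriverCleanup {

    /**
     * Deregisters only those JDBC drivers that were loaded by the given
     * class loader (e.g. Flink's user-code ChildFirstClassLoader), so that
     * drivers registered by other jobs or by the parent class loader are
     * left untouched.
     */
    public static void deregisterDriversLoadedBy(ClassLoader userCodeClassLoader) {
        Enumeration<Driver> drivers = DriverManager.getDrivers();
        while (drivers.hasMoreElements()) {
            Driver driver = drivers.nextElement();
            if (driver.getClass().getClassLoader() == userCodeClassLoader) {
                try {
                    DriverManager.deregisterDriver(driver);
                } catch (SQLException e) {
                    // best-effort cleanup; nothing more can be done here
                }
            }
        }
    }
}
{code}

Called from closeInputFormat() with getClass().getClassLoader(), this would drop DriverManager's reference into the ChildFirstClassLoader while leaving drivers owned by other class loaders registered.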
[jira] [Updated] (FLINK-19005) used metaspace grow on every execution
[ https://issues.apache.org/jira/browse/FLINK-19005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ShenDa updated FLINK-19005:
    Attachment: modified-jdbc-inputformat.png
[jira] [Updated] (FLINK-19005) used metaspace grow on every execution
[ https://issues.apache.org/jira/browse/FLINK-19005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ShenDa updated FLINK-19005:
    Attachment: origin-jdbc-inputformat.png
[jira] [Comment Edited] (FLINK-19005) used metaspace grow on every execution
[ https://issues.apache.org/jira/browse/FLINK-19005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183928#comment-17183928 ] ShenDa edited comment on FLINK-19005 at 8/25/20, 10:58 AM:

[~chesnay] I would like to know how you concluded from the dump files that the class leak is caused by java.sql.DriverManager; I still have no idea how to locate the key problem. BTW, I tried several times to reproduce the Metaspace OOM with the WordCount job, but this time Flink ran fine and no Metaspace OOM occurred, so that earlier report was my mistake.
[jira] [Commented] (FLINK-19005) used metaspace grow on every execution
[ https://issues.apache.org/jira/browse/FLINK-19005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17181741#comment-17181741 ] ShenDa commented on FLINK-19005:

[~chesnay] I didn't do this on a local cluster. I use a script that submits the job every 5 seconds to a standalone cluster, so I don't know how many executions it takes to trigger the OOM; with the default Metaspace configuration it takes a long time to occur. But if you observe the Metaspace usage, you'll find that the space is never released and grows continuously.
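The submission script itself is not attached to the ticket; a minimal sketch of the kind of loop described above (paths assume a standard Flink distribution and its bundled batch WordCount example) could look like this:

{code:sh}
#!/usr/bin/env bash
# Repeatedly submit the batch WordCount example to a standalone cluster
# to watch Metaspace usage on the TaskManagers grow over time.
while true; do
  ./bin/flink run ./examples/batch/WordCount.jar
  sleep 5
done
{code}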
[jira] [Commented] (FLINK-19005) used metaspace grow on every execution
[ https://issues.apache.org/jira/browse/FLINK-19005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17181718#comment-17181718 ] ShenDa commented on FLINK-19005:

[~chesnay] Yes, I use Flink 1.11.0 with its batch WordCount example in a JDK 11 environment.
[jira] [Comment Edited] (FLINK-19005) used metaspace grow on every execution
[ https://issues.apache.org/jira/browse/FLINK-19005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17181704#comment-17181704 ] ShenDa edited comment on FLINK-19005 at 8/21/20, 8:16 AM:

[~mapohl] Hi Matthias, I have hit the same problem as [~gestevez]. To rule out a Metaspace leak caused by my own code, I deliberately submitted the WordCount job. On every run the Metaspace grows and is never released; no matter how large I configure the Metaspace, the Metaspace OOM always occurs eventually.
[jira] [Comment Edited] (FLINK-15649) Support mounting volumes
[ https://issues.apache.org/jira/browse/FLINK-15649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164259#comment-17164259 ] ShenDa edited comment on FLINK-15649 at 7/24/20, 8:31 AM:

[~azagrebin] I agree with the approach of implementing the pod template first and then coming back to discuss how we can introduce this feature. I'll close the PR.

> Support mounting volumes
> -------------------------
>
>                 Key: FLINK-15649
>                 URL: https://issues.apache.org/jira/browse/FLINK-15649
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Deployment / Kubernetes
>            Reporter: Canbin Zheng
>            Priority: Major
>              Labels: pull-request-available
>
> Add support for mounting K8S volumes, including emptydir, hostpath, pv etc.
[jira] [Comment Edited] (FLINK-15649) Support mounting volumes
[ https://issues.apache.org/jira/browse/FLINK-15649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157327#comment-17157327 ] ShenDa edited comment on FLINK-15649 at 7/14/20, 12:22 PM:

[~felixzheng] The pod template and mounting volumes are two different features. We can't drop this ticket's feature just because the pod template can do the same work; by that argument, the already existing features (mounting the Hadoop conf and mounting the Flink conf) could also be removed.
[jira] [Commented] (FLINK-15649) Support mounting volumes
[ https://issues.apache.org/jira/browse/FLINK-15649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157309#comment-17157309 ] ShenDa commented on FLINK-15649:

[~felixzheng] I don't think we can replace this feature with the pod template, because users may just want to mount a volume without configuring any other pod specification. I also ran into the problem of the complicated config options that volume mounting introduces while I was working on this, but I don't think it is a big problem. BTW, could you describe the config options you have designed in your internal branch, so that we can discuss this issue and reach an agreement?
[jira] [Commented] (FLINK-15649) Support mounting volumes
[ https://issues.apache.org/jira/browse/FLINK-15649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157287#comment-17157287 ] ShenDa commented on FLINK-15649:

[~tison] I have already done some work on this feature; could you assign this ticket to me so I can finish it?
[jira] [Commented] (FLINK-15649) Support mounting volumes
[ https://issues.apache.org/jira/browse/FLINK-15649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17156499#comment-17156499 ] ShenDa commented on FLINK-15649:

[~felixzheng], do you have any plan to do this work?
[jira] [Resolved] (FLINK-18244) Support setup customized system environment before submitting test job
[ https://issues.apache.org/jira/browse/FLINK-18244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ShenDa resolved FLINK-18244.
    Resolution: Not A Problem
[jira] [Updated] (FLINK-18244) Support setup customized system environment before submitting test job
[ https://issues.apache.org/jira/browse/FLINK-18244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ShenDa updated FLINK-18244:
    Description: updated to fix a typo in item 3 ("involing" -> "invoking"); otherwise unchanged from the description in the Created entry below.
[jira] [Created] (FLINK-18244) Support setup customized system environment before submitting test job
ShenDa created FLINK-18244:
    Summary: Support setup customized system environment before submitting test job
        Key: FLINK-18244
        URL: https://issues.apache.org/jira/browse/FLINK-18244
    Project: Flink
 Issue Type: Improvement
 Components: Tests
   Reporter: ShenDa

The new approach to implementing e2e tests suggests developers use FlinkDistribution to submit the test job. But at present, we can't specify system environment variables when invoking submitSqlJob() or submitJob(). As a result, some connectors cannot work if the required system environment is not set up, such as the HBase connector, which needs HADOOP_CLASSPATH.
So I think we can do the following:
1) Add a new method to AutoClosableProcess and its builder class for setting specified environment variables.
2) Add a new interface that is just used to configure the system environment, and let the classes SQLJobSubmission and JobSubmission extend this interface.
3) Modify the methods submitJob() and submitSQLJob() in FlinkDistribution to set up the system environment before invoking runBlocking() or runNonBlocking().
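As an illustration of step 1, a builder hook for environment variables might look roughly like the sketch below. The class and method names (EnvAwareProcessBuilder, setEnv) are hypothetical stand-ins, not the actual AutoClosableProcess API; the point is only that ProcessBuilder.environment() is where the variables would be applied before the test job process is started.

{code:java}
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a process builder that accepts extra environment variables. */
public final class EnvAwareProcessBuilder {

    private final String[] commands;
    private final Map<String, String> env = new HashMap<>();

    public EnvAwareProcessBuilder(String... commands) {
        this.commands = commands;
    }

    /** Step 1 of the proposal: let callers register environment variables, e.g. HADOOP_CLASSPATH. */
    public EnvAwareProcessBuilder setEnv(String key, String value) {
        env.put(key, value);
        return this;
    }

    /** Applies the registered variables before the test job process is started. */
    public Process start() throws IOException {
        ProcessBuilder processBuilder = new ProcessBuilder(commands);
        processBuilder.environment().putAll(env);
        return processBuilder.start();
    }
}
{code}

A test could then do something like new EnvAwareProcessBuilder("bin/flink", "run", jobJar).setEnv("HADOOP_CLASSPATH", hadoopClasspath).start().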
[jira] [Commented] (FLINK-17678) Create flink-sql-connector-hbase module to shade HBase
[ https://issues.apache.org/jira/browse/FLINK-17678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120672#comment-17120672 ] ShenDa commented on FLINK-17678:

OK, I'm going to use another way to implement the HBase e2e test.
[jira] [Commented] (FLINK-17678) Support fink-sql-connector-hbase
[ https://issues.apache.org/jira/browse/FLINK-17678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120392#comment-17120392 ] ShenDa commented on FLINK-17678:

[~jark] Because the logic in test_sql_client.sh only allows four name prefixes for entries in a shaded jar (org/apache/flink, META-INF, LICENSE, NOTICE), every class in the shaded jar must start with org.apache.flink. But to avoid exceptions thrown by the HBase region server, the classes under org.apache.hadoop.hbase.codec.* cannot be relocated. As a result, the jar I shaded contains classes that do not start with org.apache.flink and fails the verification in test_sql_client.sh. And thanks for your suggestion, I'll try implementing the HBase e2e test in Java instead of shell scripts.
[jira] [Comment Edited] (FLINK-17678) Support fink-sql-connector-hbase
[ https://issues.apache.org/jira/browse/FLINK-17678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17119427#comment-17119427 ] ShenDa edited comment on FLINK-17678 at 5/29/20, 9:23 AM:

I encountered a problem: the HBase shaded jar we built can't pass the verification in test_sql_client.sh:

{code:sh}
if ! [[ $EXTRACTED_FILE = "$EXTRACTED_JAR/org/apache/flink"* ]] && \
   ! [[ $EXTRACTED_FILE = "$EXTRACTED_JAR/META-INF"* ]] && \
   ! [[ $EXTRACTED_FILE = "$EXTRACTED_JAR/LICENSE"* ]] && \
   ! [[ $EXTRACTED_FILE = "$EXTRACTED_JAR/NOTICE"* ]] ; then
  echo "Bad file in JAR: $EXTRACTED_FILE"
  exit 1
fi
{code}

To avoid exceptions thrown by the HBase region server, we did not relocate org.apache.hadoop.hbase.codec.*, because the HBase server cannot find the relocated codec classes when decoding data. So I relocate the HBase dependencies with the following configuration:

{code:xml}
<relocation>
  <pattern>org.apache.hbase</pattern>
  <shadedPattern>org.apache.flink.hbase.shaded.org.apache.hbase</shadedPattern>
  <excludes>
    <exclude>org.apache.hadoop.hbase.codec.*</exclude>
  </excludes>
</relocation>
{code}

But built this way, the shaded HBase jar contains a directory named org/apache/hadoop/hbase, which violates the rule {{! [[ $EXTRACTED_FILE = "$EXTRACTED_JAR/org/apache/flink"* ]]}}. So how can I solve this problem? Could I modify the test_sql_client.sh script?
[jira] [Commented] (FLINK-17939) Translate "Python Table API Installation" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-17939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116695#comment-17116695 ] ShenDa commented on FLINK-17939:

Hi [~jark], I'm interested in this issue; could you assign it to me so I can translate this page into Chinese?

> Translate "Python Table API Installation" page into Chinese
> -------------------------------------------------------------
>
>                 Key: FLINK-17939
>                 URL: https://issues.apache.org/jira/browse/FLINK-17939
>             Project: Flink
>          Issue Type: Sub-task
>          Components: API / Python, chinese-translation, Documentation
>            Reporter: Jark Wu
>            Priority: Major
>
> The page url is https://ci.apache.org/projects/flink/flink-docs-master/zh/dev/table/python/installation.html
> The markdown file is located in flink/docs/dev/table/python/installation.zh.md
[jira] [Commented] (FLINK-14359) Create a module called flink-sql-connector-hbase to shade HBase
[ https://issues.apache.org/jira/browse/FLINK-14359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116492#comment-17116492 ] ShenDa commented on FLINK-14359:

[~openinx] OK.

> Create a module called flink-sql-connector-hbase to shade HBase
> -----------------------------------------------------------------
>
>                 Key: FLINK-14359
>                 URL: https://issues.apache.org/jira/browse/FLINK-14359
>             Project: Flink
>          Issue Type: New Feature
>          Components: Connectors / HBase
>            Reporter: Jingsong Lee
>            Assignee: Zheng Hu
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.11.0
>
> We need to do the same thing for HBase as for Kafka and Elasticsearch.
[jira] [Comment Edited] (FLINK-14359) Create a module called flink-sql-connector-hbase to shade HBase
[ https://issues.apache.org/jira/browse/FLINK-14359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116480#comment-17116480 ] ShenDa edited comment on FLINK-14359 at 5/26/20, 6:35 AM:

Hi [~openinx], I'm currently working on shading the HBase connector. Coincidentally, I found that you have already done most of this work, but there has been no further activity since your PR, and there are still some conflicts blocking a merge of your code. So do you have any plan to finish this work?
[jira] [Commented] (FLINK-14359) Create a module called flink-sql-connector-hbase to shade HBase
[ https://issues.apache.org/jira/browse/FLINK-14359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116480#comment-17116480 ] ShenDa commented on FLINK-14359: Hi [~openinx], I'm now working on shading the HBase connector. Fortunately, I found you have already done most of the work, but there has been no activity since your PR and there are still some conflicts blocking your code from being merged. So do you have any plan to finish this work? > Create a module called flink-sql-connector-hbase to shade HBase > --- > > Key: FLINK-14359 > URL: https://issues.apache.org/jira/browse/FLINK-14359 > Project: Flink > Issue Type: New Feature > Components: Connectors / HBase >Reporter: Jingsong Lee >Assignee: Zheng Hu >Priority: Major > Labels: pull-request-available > Fix For: 1.11.0 > > Time Spent: 10m > Remaining Estimate: 0h > > We need do the same thing as kafka and elasticsearch to HBase. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-17678) Support fink-sql-connector-hbase
[ https://issues.apache.org/jira/browse/FLINK-17678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106941#comment-17106941 ] ShenDa commented on FLINK-17678: [~jark] Yes, maybe it's not easy to shade, but I have already implemented a simple shaded uber jar with only light testing. I can try to make it robust; could this issue be assigned to me? > Support fink-sql-connector-hbase > > > Key: FLINK-17678 > URL: https://issues.apache.org/jira/browse/FLINK-17678 > Project: Flink > Issue Type: Improvement > Components: Connectors / HBase >Reporter: ShenDa >Priority: Major > Fix For: 1.11.0 > > > Currently, flink doesn't contains a hbase uber jar, so users have to add > hbase dependency manually. > Could I create new module called flink-sql-connector-hbase like elasticsaerch > and kafka sql -connector. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17678) Support fink-sql-connector-hbase
ShenDa created FLINK-17678: -- Summary: Support fink-sql-connector-hbase Key: FLINK-17678 URL: https://issues.apache.org/jira/browse/FLINK-17678 Project: Flink Issue Type: Improvement Reporter: ShenDa Currently, Flink doesn't contain an HBase uber jar, so users have to add the HBase dependency manually. Could I create a new module called flink-sql-connector-hbase, like the Elasticsearch and Kafka SQL connectors? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-16713) Support source mode of elasticsearch connector
[ https://issues.apache.org/jira/browse/FLINK-16713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106823#comment-17106823 ] ShenDa commented on FLINK-16713: [~jark], I'm a colleague of [~jackylau]. Our company truly has the scenario of reading from Elasticsearch, as other companies do. As you said above, we can start by implementing a bounded ES source. So is it possible for us to draft a new design doc to discuss the details? > Support source mode of elasticsearch connector > -- > > Key: FLINK-16713 > URL: https://issues.apache.org/jira/browse/FLINK-16713 > Project: Flink > Issue Type: Improvement > Components: Connectors / ElasticSearch >Affects Versions: 1.10.0 >Reporter: jackray wang >Priority: Major > > [https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/table/connect.html#elasticsearch-connector] > For append-only queries, the connector can also operate in [append > mode|https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/table/connect.html#update-modes] > for exchanging only INSERT messages with the external system. If no key is > defined by the query, a key is automatically generated by Elasticsearch. > I want to know ,why the connector of flink with ES just support sink but > doesn't support source .Which version could add this feature to ? -- This message was sent by Atlassian Jira (v8.3.4#803005)
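For context on what a bounded ES source could look like, here is a minimal sketch, assuming the Elasticsearch high-level REST client; the host, index name, and single-request read are placeholder simplifications rather than a design agreed in this thread:
{code:java}
import org.apache.flink.streaming.api.functions.source.RichSourceFunction;
import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;

/** Hypothetical bounded source that emits each document of an index as a JSON string, then finishes. */
public class BoundedElasticsearchSource extends RichSourceFunction<String> {

    private volatile boolean running = true;

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        try (RestHighLevelClient client =
                new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")))) {
            SearchRequest request = new SearchRequest("my-index");
            request.source(new SearchSourceBuilder().query(QueryBuilders.matchAllQuery()).size(1000));
            SearchResponse response = client.search(request, RequestOptions.DEFAULT);
            for (SearchHit hit : response.getHits().getHits()) {
                if (!running) {
                    break;
                }
                ctx.collect(hit.getSourceAsString());
            }
        }
        // Returning from run() makes the source bounded; a real implementation would page
        // through results with the scroll or search_after API instead of a single request.
    }

    @Override
    public void cancel() {
        running = false;
    }
}
{code}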
[jira] [Resolved] (FLINK-14868) Provides the ability for multiple sinks to write data serially
[ https://issues.apache.org/jira/browse/FLINK-14868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ShenDa resolved FLINK-14868. Resolution: Resolved We implemented a TableSink that wraps the other sinks that need to be written serially. > Provides the ability for multiple sinks to write data serially > -- > > Key: FLINK-14868 > URL: https://issues.apache.org/jira/browse/FLINK-14868 > Project: Flink > Issue Type: Wish > Components: API / DataStream, Table SQL / Runtime >Affects Versions: 1.9.1 >Reporter: ShenDa >Priority: Major > > At present, Flink can use multiple sinks to write data into different data > source such as HBase,Kafka,Elasticsearch,etc.And this process is concurrent > ,in other words, one record will be written into data sources simultaneously. > But there is no approach that can sinking data serially.We really wish Flink > can providing this kind of ability that a sink can write data into target > database only after the previous sink transfers data successfully.And if the > previous sink encounters any exception, the next sink will not work. > h1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
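To make the wrapping approach concrete, a minimal sketch of a serially invoking wrapper sink; this is not the TableSink mentioned in the resolution, and the class name, delegate list, and the omitted open/close handling are assumptions for illustration only:
{code:java}
import java.util.List;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;

/** Hypothetical wrapper that forwards each record to its delegate sinks strictly in order. */
public class SerialSink<T> extends RichSinkFunction<T> {

    private final List<SinkFunction<T>> delegates;

    public SerialSink(List<SinkFunction<T>> delegates) {
        this.delegates = delegates;
    }

    @Override
    public void invoke(T value, Context context) throws Exception {
        // Each delegate only sees the record after the previous one returned successfully;
        // if a delegate throws, the remaining delegates are skipped and the failure propagates.
        for (SinkFunction<T> delegate : delegates) {
            delegate.invoke(value, context);
        }
    }
}
{code}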
[jira] [Resolved] (FLINK-15182) Add FileTableSink based on StreamingFileSink
[ https://issues.apache.org/jira/browse/FLINK-15182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ShenDa resolved FLINK-15182. Resolution: Resolved FLIP-115 is discussing this issue > Add FileTableSink based on StreamingFileSink > > > Key: FLINK-15182 > URL: https://issues.apache.org/jira/browse/FLINK-15182 > Project: Flink > Issue Type: Improvement > Components: Connectors / FileSystem >Reporter: ShenDa >Priority: Major > > Flink already has FileSystem connector that can sink data into specified file > system, but it just writes data to a target path without bukcet and it can't > configure file rolling policy neither. We can use StreamingFileSink to avoid > those shortcomings instead, however, the StreamingFileSink is a SinkFunction > that users have to implement their processing logic with DataStreamAPI. > Therefore, we want to add a new FileTableSink based on StreamingFileSink. > This will help us connect file system by using TableAPI&SQL. But according to > the two ways of StreamingFileSink to sinking data, bulk format and row > format, it comes a question to us. Does the FileTableSink have to support > bulk format and row format simultaneously? Or separating this sink into two > table sinks, BulkFileTableSink and RowFileTableSink, which write data by > using one format respectively? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-14868) Provides the ability for multiple sinks to write data serially
[ https://issues.apache.org/jira/browse/FLINK-14868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16995265#comment-16995265 ] ShenDa edited comment on FLINK-14868 at 12/13/19 1:32 AM: -- [~rmetzger] Thanks for your reply; it's a nice idea to implement a custom sink that wraps a series of sinks to write data together. But if I want to use the Table API & SQL to sink data into different targets in order, is there any good way to achieve this? was (Author: dadashen): [~rmetzger] Thanks for your reply, it's a nice a idea to implement a custom sink that wrap series sinks to write data together. But, if I want to use TableAPI&SQL to sink data into different targets by order, is there any good way to archive this? > Provides the ability for multiple sinks to write data serially > -- > > Key: FLINK-14868 > URL: https://issues.apache.org/jira/browse/FLINK-14868 > Project: Flink > Issue Type: Wish > Components: API / DataStream, Table SQL / Runtime >Affects Versions: 1.9.1 >Reporter: ShenDa >Priority: Major > > At present, Flink can use multiple sinks to write data into different data > source such as HBase,Kafka,Elasticsearch,etc.And this process is concurrent > ,in other words, one record will be written into data sources simultaneously. > But there is no approach that can sinking data serially.We really wish Flink > can providing this kind of ability that a sink can write data into target > database only after the previous sink transfers data successfully.And if the > previous sink encounters any exception, the next sink will not work. > h1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-14868) Provides the ability for multiple sinks to write data serially
[ https://issues.apache.org/jira/browse/FLINK-14868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16995265#comment-16995265 ] ShenDa commented on FLINK-14868: [~rmetzger] Thanks for your reply; it's a nice idea to implement a custom sink that wraps a series of sinks to write data together. But if I want to use the Table API & SQL to sink data into different targets in order, is there any good way to achieve this? > Provides the ability for multiple sinks to write data serially > -- > > Key: FLINK-14868 > URL: https://issues.apache.org/jira/browse/FLINK-14868 > Project: Flink > Issue Type: Wish > Components: API / DataStream, Table SQL / Runtime >Affects Versions: 1.9.1 >Reporter: ShenDa >Priority: Major > > At present, Flink can use multiple sinks to write data into different data > source such as HBase,Kafka,Elasticsearch,etc.And this process is concurrent > ,in other words, one record will be written into data sources simultaneously. > But there is no approach that can sinking data serially.We really wish Flink > can providing this kind of ability that a sink can write data into target > database only after the previous sink transfers data successfully.And if the > previous sink encounters any exception, the next sink will not work. > h1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-15182) Add FileTableSink based on StreamingFileSink
[ https://issues.apache.org/jira/browse/FLINK-15182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994575#comment-16994575 ] ShenDa commented on FLINK-15182: [~jark] We are very glad to see the intention of implementing a unified FileSystem TableSink. But we also find that the code in the filesystem connector is deprecated, and the flink-table module contains CsvTableSink & CsvTableSource. So we think the current hierarchical structure of the FileSystem connector needs to be refactored. Besides, maybe CSV should be a kind of format, like JSON, rather than an individual TableSource or TableSink. > Add FileTableSink based on StreamingFileSink > > > Key: FLINK-15182 > URL: https://issues.apache.org/jira/browse/FLINK-15182 > Project: Flink > Issue Type: Improvement > Components: Connectors / FileSystem >Reporter: ShenDa >Priority: Major > > Flink already has FileSystem connector that can sink data into specified file > system, but it just writes data to a target path without bukcet and it can't > configure file rolling policy neither. We can use StreamingFileSink to avoid > those shortcomings instead, however, the StreamingFileSink is a SinkFunction > that users have to implement their processing logic with DataStreamAPI. > Therefore, we want to add a new FileTableSink based on StreamingFileSink. > This will help us connect file system by using TableAPI&SQL. But according to > the two ways of StreamingFileSink to sinking data, bulk format and row > format, it comes a question to us. Does the FileTableSink have to support > bulk format and row format simultaneously? Or separating this sink into two > table sinks, BulkFileTableSink and RowFileTableSink, which write data by > using one format respectively? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-15182) Add FileTableSink based on StreamingFileSink
[ https://issues.apache.org/jira/browse/FLINK-15182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ShenDa updated FLINK-15182: --- Description: Flink already has FileSystem connector that can sink data into specified file system, but it just writes data to a target path without bukcet and it can't configure file rolling policy neither. We can use StreamingFileSink to avoid those shortcomings instead, however, the StreamingFileSink is a SinkFunction that users have to implement their processing logic with DataStreamAPI. Therefore, we want to add a new FileTableSink based on StreamingFileSink. This will help us connect file system by using TableAPI&SQL. But according to the two ways of StreamingFileSink to sinking data, bulk format and row format, it comes a question to us. Does the FileTableSink have to support bulk format and row format simultaneously? Or separating this sink into two table sinks, BulkFileTableSink and RowFileTableSink, which write data by using one format respectively? was: Flink already has FileSystem connector that can sink data into specified file system, but it just writes data to a target path without bukcet and it can't configure file rolling policy neither. We can use StreamingFileSink to avoid those shortcomings instead, however, the StreamingFileSink is a SinkFunction that users have to implement their processing logic with DataStreamAPI. Therefore, we want to add a new FileTableSink based on StreamingFileSink. This will help us connect file system by using TableAPI&SQL. But according to the two ways of StreamingFileSink to sinking data, bulk format and row format, it comes a question to us. Does the FileTableSink have to support bulk format and row format simultaneously? Or separating this sink into two table sinks, BulkFileTableSink and RowFileTableSink, which writes data by using one format respectively? > Add FileTableSink based on StreamingFileSink > > > Key: FLINK-15182 > URL: https://issues.apache.org/jira/browse/FLINK-15182 > Project: Flink > Issue Type: Improvement > Components: Connectors / FileSystem >Affects Versions: 1.9.0 >Reporter: ShenDa >Priority: Major > > Flink already has FileSystem connector that can sink data into specified file > system, but it just writes data to a target path without bukcet and it can't > configure file rolling policy neither. We can use StreamingFileSink to avoid > those shortcomings instead, however, the StreamingFileSink is a SinkFunction > that users have to implement their processing logic with DataStreamAPI. > Therefore, we want to add a new FileTableSink based on StreamingFileSink. > This will help us connect file system by using TableAPI&SQL. But according to > the two ways of StreamingFileSink to sinking data, bulk format and row > format, it comes a question to us. Does the FileTableSink have to support > bulk format and row format simultaneously? Or separating this sink into two > table sinks, BulkFileTableSink and RowFileTableSink, which write data by > using one format respectively? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-15182) Add FileTableSink based on StreamingFileSink
[ https://issues.apache.org/jira/browse/FLINK-15182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ShenDa updated FLINK-15182: --- Description: Flink already has FileSystem connector that can sink data into specified file system, but it just writes data to a target path without bukcet and it can't configure file rolling policy neither. We can use StreamingFileSink to avoid those shortcomings instead, however, the StreamingFileSink is a SinkFunction that users have to implement their processing logic with DataStreamAPI. Therefore, we want to add a new FileTableSink based on StreamingFileSink. This will help us connect file system by using TableAPI&SQL. But according to the two ways of StreamingFileSink to sinking data, bulk format and row format, it comes a question to us. Does the FileTableSink have to support bulk format and row format simultaneously? Or separating this sink into two table sinks, BulkFileTableSink and RowFileTableSink, which writes data by using one format respectively? was: Flink already has FileSystem connector that can sink data into specified file system, but it just writes data to a target path without bukcet and it can't configure file rolling policy neither. We can use StreamingFileSink to avoid those shortcomings instead, however, the StreamingFileSink is a SinkFunction that users have to implement their processing logic with DataStreamAPI. Therefore, we want to add a new FileTableSink based on StreamingFileSink. This will help us connect file system by using TableAPI&SQL. But according to the two ways of StreamingFileSink to sinking data, bulk format and row format, it comes a question to us. Does the FileTableSink have to support bulk format and row format simultaneously? Or separating this sink into two table sinks, BulkFileTableSink and RowFileTableSink, which writes data by using one format respectively. > Add FileTableSink based on StreamingFileSink > > > Key: FLINK-15182 > URL: https://issues.apache.org/jira/browse/FLINK-15182 > Project: Flink > Issue Type: Improvement > Components: Connectors / FileSystem >Affects Versions: 1.9.0 >Reporter: ShenDa >Priority: Major > > Flink already has FileSystem connector that can sink data into specified file > system, but it just writes data to a target path without bukcet and it can't > configure file rolling policy neither. We can use StreamingFileSink to avoid > those shortcomings instead, however, the StreamingFileSink is a SinkFunction > that users have to implement their processing logic with DataStreamAPI. > Therefore, we want to add a new FileTableSink based on StreamingFileSink. > This will help us connect file system by using TableAPI&SQL. But according to > the two ways of StreamingFileSink to sinking data, bulk format and row > format, it comes a question to us. Does the FileTableSink have to support > bulk format and row format simultaneously? Or separating this sink into two > table sinks, BulkFileTableSink and RowFileTableSink, which writes data by > using one format respectively? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-15182) Add FileTableSink based on StreamingFileSink
[ https://issues.apache.org/jira/browse/FLINK-15182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ShenDa updated FLINK-15182: --- Description: Flink already has FileSystem connector that can sink data into specified file system, but it just writes data to a target path without bukcet and it can't configure file rolling policy neither. We can use StreamingFileSink to avoid those shortcomings instead, however, the StreamingFileSink is a SinkFunction that users have to implement their processing logic with DataStreamAPI. Therefore, we want to add a new FileTableSink based on StreamingFileSink. This will help us connect file system by using TableAPI&SQL. But according to the two ways of StreamingFileSink to sinking data, bulk format and row format, it comes a question to us. Does the FileTableSink have to support bulk format and row format simultaneously? Or separating this sink into two table sinks, BulkFileTableSink and RowFileTableSink, which writes data by using one format respectively. was: Flink already has FileSystem connector that can sink data into specified file system, but it just writes data to a target path without bukcet and it can't configure file rolling policy neither. We can use StreamingFileSink to avoid those shortcomings instead, however, the StreamingFileSink is a SinkFunction that users have to implement their processing logic with DataStreamAPI. Therefore, we want to add a new FileTableSink based on StreamingFileSink. This will help us connect file system by using TableAPI&SQL. But according to the two ways of StreamingFileSink to sinking data, bulk format and row format, it comes a question to us. Does the FileTableSink have to support bulk format and row format simultaneously? Or separating this sink into two sinks, BulkFileTableSink and RowFileTableSink, which writes data by using one format respectively. > Add FileTableSink based on StreamingFileSink > > > Key: FLINK-15182 > URL: https://issues.apache.org/jira/browse/FLINK-15182 > Project: Flink > Issue Type: Improvement > Components: Connectors / FileSystem >Affects Versions: 1.9.0 >Reporter: ShenDa >Priority: Major > > Flink already has FileSystem connector that can sink data into specified file > system, but it just writes data to a target path without bukcet and it can't > configure file rolling policy neither. We can use StreamingFileSink to avoid > those shortcomings instead, however, the StreamingFileSink is a SinkFunction > that users have to implement their processing logic with DataStreamAPI. > Therefore, we want to add a new FileTableSink based on StreamingFileSink. > This will help us connect file system by using TableAPI&SQL. But according to > the two ways of StreamingFileSink to sinking data, bulk format and row > format, it comes a question to us. Does the FileTableSink have to support > bulk format and row format simultaneously? Or separating this sink into two > table sinks, BulkFileTableSink and RowFileTableSink, which writes data by > using one format respectively. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-15182) Add FileTableSink based on StreamingFileSink
[ https://issues.apache.org/jira/browse/FLINK-15182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ShenDa updated FLINK-15182: --- Description: Flink already has FileSystem connector that can sink data into specified file system, but it just writes data to a target path without bukcet and it can't configure file rolling policy neither. We can use StreamingFileSink to avoid those shortcomings instead, however, the StreamingFileSink is a SinkFunction that users have to implement their processing logic with DataStreamAPI. Therefore, we want to add a new FileTableSink based on StreamingFileSink. This will help us connect file system by using TableAPI&SQL. But according to the two ways of StreamingFileSink to sinking data, bulk format and row format, it comes a question to us. Does the FileTableSink have to support bulk format and row format simultaneously? Or separating this sink into two sinks, BulkFileTableSink and RowFileTableSink, which writes data by using one format respectively. was: Flink already has FileSystem connector that can sink data into specified file system, but it just writes data to a target path without bukcet and it can't configure file rolling policy neither. We can use StreamingFileSink to avoid those shortcomings instead, however, the StreamingFileSink is a SinkFunction that users have to implement their processing logic with DataStreamAPI. Therefore, we want to add a new FileTableSink based on StreamingFileSink. This will help us connect file system by using TableAPI&SQL. But according to the two ways of StreamingFileSink to sinking data, bulk format and row format, it comes a question to us. Does the FileTableSink support bulk format and row format simultaneously? Or separating this sink into two sinks, BulkFileTableSink and RowFileTableSink, which writes data by using one format respectively. > Add FileTableSink based on StreamingFileSink > > > Key: FLINK-15182 > URL: https://issues.apache.org/jira/browse/FLINK-15182 > Project: Flink > Issue Type: Improvement > Components: Connectors / FileSystem >Affects Versions: 1.9.0 >Reporter: ShenDa >Priority: Major > > Flink already has FileSystem connector that can sink data into specified file > system, but it just writes data to a target path without bukcet and it can't > configure file rolling policy neither. We can use StreamingFileSink to avoid > those shortcomings instead, however, the StreamingFileSink is a SinkFunction > that users have to implement their processing logic with DataStreamAPI. > Therefore, we want to add a new FileTableSink based on StreamingFileSink. > This will help us connect file system by using TableAPI&SQL. But according to > the two ways of StreamingFileSink to sinking data, bulk format and row > format, it comes a question to us. Does the FileTableSink have to support > bulk format and row format simultaneously? Or separating this sink into two > sinks, BulkFileTableSink and RowFileTableSink, which writes data by using one > format respectively. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-15182) Add FileTableSink based on StreamingFileSink
[ https://issues.apache.org/jira/browse/FLINK-15182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ShenDa updated FLINK-15182: --- Description: Flink already has FileSystem connector that can sink data into specified file system, but it just writes data to a target path without bukcet and it can't configure file rolling policy neither. We can use StreamingFileSink to avoid those shortcomings instead, however, the StreamingFileSink is a SinkFunction that users have to implement their processing logic with DataStreamAPI. Therefore, we want to add a new FileTableSink based on StreamingFileSink. This will help us connect file system by using TableAPI&SQL. But according to the two ways of StreamingFileSink to sinking data, bulk format and row format, it comes a question to us. Does the FileTableSink support bulk format and row format simultaneously? Or separating this sink into two sinks, BulkFileTableSink and RowFileTableSink, which writes data by using one format respectively. was: Flink already has FileSystem connector that can sink data into specified file system, but it just writes data to a target path without bukcet and it can't configure file rolling policy neither. We can use StreamingFileSink to avoid those shortcomings instead, however, the StreamingFileSink is a SinkFunction that users have to implement their processing logic with DataStreamAPI. Therefore, we want to add a new FileTableSink based on StreamingFileSink. This will help us connect file system by using TableAPI&SQL. But according the two ways of StreamingFileSink to sinking data, bulk format and row format, it comes a question to us. Does the FileTableSink support bulk format and row format simultaneously? Or separating this sink into two sinks, BulkFileTableSink and RowFileTableSink, which writes data by using one format respectively. > Add FileTableSink based on StreamingFileSink > > > Key: FLINK-15182 > URL: https://issues.apache.org/jira/browse/FLINK-15182 > Project: Flink > Issue Type: Improvement > Components: Connectors / FileSystem >Affects Versions: 1.9.0 >Reporter: ShenDa >Priority: Major > > Flink already has FileSystem connector that can sink data into specified file > system, but it just writes data to a target path without bukcet and it can't > configure file rolling policy neither. We can use StreamingFileSink to avoid > those shortcomings instead, however, the StreamingFileSink is a SinkFunction > that users have to implement their processing logic with DataStreamAPI. > Therefore, we want to add a new FileTableSink based on StreamingFileSink. > This will help us connect file system by using TableAPI&SQL. But according to > the two ways of StreamingFileSink to sinking data, bulk format and row > format, it comes a question to us. Does the FileTableSink support bulk format > and row format simultaneously? Or separating this sink into two sinks, > BulkFileTableSink and RowFileTableSink, which writes data by using one format > respectively. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-15182) Add FileTableSink based on StreamingFileSink
ShenDa created FLINK-15182: -- Summary: Add FileTableSink based on StreamingFileSink Key: FLINK-15182 URL: https://issues.apache.org/jira/browse/FLINK-15182 Project: Flink Issue Type: Improvement Components: Connectors / FileSystem Affects Versions: 1.9.0 Reporter: ShenDa Flink already has a FileSystem connector that can sink data into a specified file system, but it just writes data to a target path without buckets, and it can't configure a file rolling policy either. We can use StreamingFileSink to avoid those shortcomings instead; however, StreamingFileSink is a SinkFunction, so users have to implement their processing logic with the DataStream API. Therefore, we want to add a new FileTableSink based on StreamingFileSink. This will help us connect to file systems by using the Table API & SQL. But the two ways StreamingFileSink writes data, bulk format and row format, raise a question for us. Does the FileTableSink support bulk format and row format simultaneously? Or should this sink be separated into two sinks, BulkFileTableSink and RowFileTableSink, each writing data using one format? -- This message was sent by Atlassian Jira (v8.3.4#803005)
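For readers unfamiliar with the two write modes mentioned in the description, a minimal sketch of the existing DataStream-level StreamingFileSink API that the proposed FileTableSink would wrap; the output path and string encoder are placeholder choices:
{code:java}
import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

public class StreamingFileSinkExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<String> lines = env.fromElements("a", "b", "c");

        // Row format: each record is encoded and appended to the in-progress part file.
        StreamingFileSink<String> rowSink = StreamingFileSink
                .forRowFormat(new Path("/tmp/flink-row-output"), new SimpleStringEncoder<String>("UTF-8"))
                .build();
        // The bulk-format variant, StreamingFileSink.forBulkFormat(...), takes a BulkWriter.Factory
        // (e.g. for Parquet or ORC) and rolls part files on checkpoints instead of a size/time policy.

        lines.addSink(rowSink);
        env.execute("StreamingFileSink row-format example");
    }
}
{code}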
[jira] [Commented] (FLINK-12584) Add Bucket File Syetem Table Sink
[ https://issues.apache.org/jira/browse/FLINK-12584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987569#comment-16987569 ] ShenDa commented on FLINK-12584: [~zhangjun] Is there any progress on this issue? Our group also needs a bucket filesystem table sink. > Add Bucket File Syetem Table Sink > - > > Key: FLINK-12584 > URL: https://issues.apache.org/jira/browse/FLINK-12584 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Affects Versions: 1.8.0, 1.9.0 >Reporter: Jun Zhang >Assignee: Jun Zhang >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > 1. *Motivation* > In flink, the file system (especially hdfs) is a very common output, but for > users using sql, it does not support directly using sql to write data to the > file system, so I want to add a bucket file system table sink, the user can > register it to StreamTableEnvironment, so that table api and sql api can > directly use the sink to write stream data to filesystem > 2.*example* > tEnv.connect(new Bucket().basePath("hdfs://localhost/tmp/flink-data")) > .withFormat(new Json().deriveSchema()) > .withSchema(new Schema() > .field("name", Types. STRING ()) > .field("age", Types. INT ()) > .inAppendMode() > .registerTableSink("myhdfssink"); > tEnv.sqlUpdate("insert into myhdfssink SELECT * FROM mytablesource"); > > 3.*Some ideas to achieve this function* > 1) Add a class Bucket which extends from ConnectorDescriptor, add some > properties, such as basePath. > 2) Add a class BucketValidator which extends from the > ConnectorDescriptorValidator and is used to check the bucket descriptor. > 3) Add a class FileSystemTableSink to implement the StreamTableSink > interface. In the emitDataStream method, construct StreamingFileSink for > writing data to filesystem according to different properties. > 4) Add a factory class FileSystemTableSinkFactory to implement the > StreamTableSinkFactory interface for constructing FileSystemTableSink > 5) The parameters of withFormat method is the implementation classes of the > FormatDescriptor interface, such as Json, Csv, and we can add Parquet、Orc > later. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-14868) Provides the ability for multiple sinks to write data serially
ShenDa created FLINK-14868: -- Summary: Provides the ability for multiple sinks to write data serially Key: FLINK-14868 URL: https://issues.apache.org/jira/browse/FLINK-14868 Project: Flink Issue Type: Wish Components: API / DataStream, Table SQL / Runtime Affects Versions: 1.9.1 Reporter: ShenDa Fix For: 1.9.2 At present, Flink can use multiple sinks to write data into different data sources such as HBase, Kafka, Elasticsearch, etc., and this process is concurrent; in other words, one record is written into all data sources simultaneously. But there is no approach for sinking data serially. We really wish Flink could provide this kind of ability: a sink writes data into its target database only after the previous sink has transferred the data successfully, and if the previous sink encounters any exception, the next sink will not run. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-13661) Add a stream specific CREATE TABLE SQL DDL
[ https://issues.apache.org/jira/browse/FLINK-13661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16935513#comment-16935513 ] ShenDa edited comment on FLINK-13661 at 9/23/19 2:27 AM: - OK, thanks [~jark]. Where could we find or join this design discussion? was (Author: dadashen): OK thanks, [~jark]. Where cloud we find or join this design discuss? > Add a stream specific CREATE TABLE SQL DDL > -- > > Key: FLINK-13661 > URL: https://issues.apache.org/jira/browse/FLINK-13661 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: Jark Wu >Priority: Major > Fix For: 1.10.0 > > > FLINK-6962 has introduced a basic SQL DDL to create a table. However, it > doesn't support stream specific features, for example, watermark definition, > changeflag definition, computed columns, primary keys and so on. > We started a design doc[1] to discuss the concepts of source and sink to help > us have a well-defined DDL. Once the FLIP is accepted, we can start the work. > [1]: > https://docs.google.com/document/d/1yrKXEIRATfxHJJ0K3t6wUgXAtZq8D-XgvEnvl2uUcr0/edit#heading=h.c05t427gfgxa > Considering the streaming DDL is a huge topic, we decided to split it into > separate FLIPs. Here comes the first one. > FLIP-66: Support time attribute in SQL DDL: > https://docs.google.com/document/d/1-SecocBqzUh7zY6HBYcfMlG_0z-JAcuZkCvsmN3LrOw/edit# -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-13661) Add a stream specific CREATE TABLE SQL DDL
[ https://issues.apache.org/jira/browse/FLINK-13661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16935513#comment-16935513 ] ShenDa commented on FLINK-13661: OK, thanks [~jark]. Where could we find or join this design discussion? > Add a stream specific CREATE TABLE SQL DDL > -- > > Key: FLINK-13661 > URL: https://issues.apache.org/jira/browse/FLINK-13661 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: Jark Wu >Priority: Major > Fix For: 1.10.0 > > > FLINK-6962 has introduced a basic SQL DDL to create a table. However, it > doesn't support stream specific features, for example, watermark definition, > changeflag definition, computed columns, primary keys and so on. > We started a design doc[1] to discuss the concepts of source and sink to help > us have a well-defined DDL. Once the FLIP is accepted, we can start the work. > [1]: > https://docs.google.com/document/d/1yrKXEIRATfxHJJ0K3t6wUgXAtZq8D-XgvEnvl2uUcr0/edit#heading=h.c05t427gfgxa > Considering the streaming DDL is a huge topic, we decided to split it into > separate FLIPs. Here comes the first one. > FLIP-66: Support time attribute in SQL DDL: > https://docs.google.com/document/d/1-SecocBqzUh7zY6HBYcfMlG_0z-JAcuZkCvsmN3LrOw/edit# -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-13661) Add a stream specific CREATE TABLE SQL DDL
[ https://issues.apache.org/jira/browse/FLINK-13661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16935501#comment-16935501 ] ShenDa commented on FLINK-13661: Hi all, is there any progress or issue for the combined table that combines historical and changelog data? It would be really useful for us to use this kind of table. > Add a stream specific CREATE TABLE SQL DDL > -- > > Key: FLINK-13661 > URL: https://issues.apache.org/jira/browse/FLINK-13661 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: Jark Wu >Priority: Major > Fix For: 1.10.0 > > > FLINK-6962 has introduced a basic SQL DDL to create a table. However, it > doesn't support stream specific features, for example, watermark definition, > changeflag definition, computed columns, primary keys and so on. > We started a design doc[1] to discuss the concepts of source and sink to help > us have a well-defined DDL. Once the FLIP is accepted, we can start the work. > [1]: > https://docs.google.com/document/d/1yrKXEIRATfxHJJ0K3t6wUgXAtZq8D-XgvEnvl2uUcr0/edit#heading=h.c05t427gfgxa > Considering the streaming DDL is a huge topic, we decided to split it into > separate FLIPs. Here comes the first one. > FLIP-66: Support time attribute in SQL DDL: > https://docs.google.com/document/d/1-SecocBqzUh7zY6HBYcfMlG_0z-JAcuZkCvsmN3LrOw/edit# -- This message was sent by Atlassian Jira (v8.3.4#803005)
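As a concrete illustration of the stream-specific features listed in the quoted description (written against the DDL syntax that later FLIPs eventually introduced, not something available at the time of these comments), a sketch with a computed column and a watermark definition; the table name, fields, and connector option are placeholders:
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class StreamingDdlExample {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.newInstance().inStreamingMode().build());

        // Computed column (order_time derived from an epoch-millis BIGINT field) and a
        // watermark definition on that time attribute, the stream-specific pieces discussed above.
        tEnv.executeSql(
                "CREATE TABLE orders (" +
                "  order_id BIGINT," +
                "  ts_millis BIGINT," +
                "  order_time AS TO_TIMESTAMP_LTZ(ts_millis, 3)," +
                "  WATERMARK FOR order_time AS order_time - INTERVAL '5' SECOND" +
                ") WITH (" +
                "  'connector' = 'datagen'" +
                ")");
    }
}
{code}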