[jira] [Commented] (FLINK-9061) S3 checkpoint data not partitioned well -- causes errors and poor performance

2018-03-24 Thread Steven Zhen Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412814#comment-16412814
 ] 

Steven Zhen Wu commented on FLINK-9061:
---

[~jgrier] Yes, we want to contribute this back. We can probably partner with 
you to get the change upstream.

> S3 checkpoint data not partitioned well -- causes errors and poor performance
> -
>
> Key: FLINK-9061
> URL: https://issues.apache.org/jira/browse/FLINK-9061
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, State Backends, Checkpointing
>Affects Versions: 1.4.2
>Reporter: Jamie Grier
>Priority: Critical
>
> I think we need to modify the way we write checkpoints to S3 for high-scale 
> jobs (those with many total tasks).  The issue is that we are writing all the 
> checkpoint data under a common key prefix.  This is the worst case scenario 
> for S3 performance since the key is used as a partition key.
>  
> In the worst case checkpoints fail with a 500 status code coming back from S3 
> and an internal error type of TooBusyException.
>  
> One possible solution would be to add a hook in the Flink filesystem code 
> that allows me to "rewrite" paths.  For example say I have the checkpoint 
> directory set to:
>  
> s3://bucket/flink/checkpoints
>  
> I would hook that and rewrite that path to:
>  
> s3://bucket/[HASH]/flink/checkpoints, where HASH is the hash of the original 
> path
>  
> This would distribute the checkpoint write load around the S3 cluster evenly.
>  
> For reference: 
> https://aws.amazon.com/premiumsupport/knowledge-center/s3-bucket-performance-improve/
>  
> Any other people hit this issue?  Any other ideas for solutions?  This is a 
> pretty serious problem for people trying to checkpoint to S3.
>  
> -Jamie
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9061) S3 checkpoint data not partitioned well -- causes errors and poor performance

2018-03-24 Thread Jamie Grier (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412737#comment-16412737
 ] 

Jamie Grier commented on FLINK-9061:


[~stevenz3wu] Did you contribute those changes back to Flink?  I think this 
will be affecting others as well.  Would you consider a contribution back to 
the project?  Otherwise I will do this but if you already have something 
working we might as well use it or base changes on it.

> S3 checkpoint data not partitioned well -- causes errors and poor performance
> -
>
> Key: FLINK-9061
> URL: https://issues.apache.org/jira/browse/FLINK-9061
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, State Backends, Checkpointing
>Affects Versions: 1.4.2
>Reporter: Jamie Grier
>Priority: Critical
>
> I think we need to modify the way we write checkpoints to S3 for high-scale 
> jobs (those with many total tasks).  The issue is that we are writing all the 
> checkpoint data under a common key prefix.  This is the worst case scenario 
> for S3 performance since the key is used as a partition key.
>  
> In the worst case checkpoints fail with a 500 status code coming back from S3 
> and an internal error type of TooBusyException.
>  
> One possible solution would be to add a hook in the Flink filesystem code 
> that allows me to "rewrite" paths.  For example say I have the checkpoint 
> directory set to:
>  
> s3://bucket/flink/checkpoints
>  
> I would hook that and rewrite that path to:
>  
> s3://bucket/[HASH]/flink/checkpoints, where HASH is the hash of the original 
> path
>  
> This would distribute the checkpoint write load around the S3 cluster evenly.
>  
> For reference: 
> https://aws.amazon.com/premiumsupport/knowledge-center/s3-bucket-performance-improve/
>  
> Any other people hit this issue?  Any other ideas for solutions?  This is a 
> pretty serious problem for people trying to checkpoint to S3.
>  
> -Jamie
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9069) Fix some double semicolons to single semicolons, and update checkstyle

2018-03-24 Thread Jacob Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-9069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacob Park updated FLINK-9069:
--
Description: 
As I was reading through the source code, I noticed that there were some double 
semicolons ";;".

These should be fixed.

Finally, the tools/maven/checkstyle.xml should be updated accordingly.

  was:
As I was reading through the source code, I noticed that there were some double 
semi-colons ";;".

These should be fixed.

Finally, the tools/maven/checkstyle.xml should be updated accordingly.


> Fix some double semicolons to single semicolons, and update checkstyle
> --
>
> Key: FLINK-9069
> URL: https://issues.apache.org/jira/browse/FLINK-9069
> Project: Flink
>  Issue Type: Task
>  Components: Checkstyle
>Reporter: Jacob Park
>Priority: Trivial
>  Labels: starter
>
> As I was reading through the source code, I noticed that there were some 
> double semicolons ";;".
> These should be fixed.
> Finally, the tools/maven/checkstyle.xml should be updated accordingly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9069) Fix some double semicolons to single semicolons, and update checkstyle

2018-03-24 Thread Jacob Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-9069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacob Park updated FLINK-9069:
--
Summary: Fix some double semicolons to single semicolons, and update 
checkstyle  (was: Fix some double semi-colons to single semi-colons, and update 
checkstyle)

> Fix some double semicolons to single semicolons, and update checkstyle
> --
>
> Key: FLINK-9069
> URL: https://issues.apache.org/jira/browse/FLINK-9069
> Project: Flink
>  Issue Type: Task
>  Components: Checkstyle
>Reporter: Jacob Park
>Priority: Trivial
>  Labels: starter
>
> As I was reading through the source code, I noticed that there were some 
> double semi-colons ";;".
> These should be fixed.
> Finally, the tools/maven/checkstyle.xml should be updated accordingly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9069) Fix some double semi-colons to single semi-colons, and update checkstyle

2018-03-24 Thread Jacob Park (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412690#comment-16412690
 ] 

Jacob Park commented on FLINK-9069:
---

Can I be assigned to this ticket? I want to use it learn the contribution 
process for Apache Flink.

> Fix some double semi-colons to single semi-colons, and update checkstyle
> 
>
> Key: FLINK-9069
> URL: https://issues.apache.org/jira/browse/FLINK-9069
> Project: Flink
>  Issue Type: Task
>  Components: Checkstyle
>Reporter: Jacob Park
>Priority: Trivial
>  Labels: starter
>
> As I was reading through the source code, I noticed that there were some 
> double semi-colons ";;".
> These should be fixed.
> Finally, the tools/maven/checkstyle.xml should be updated accordingly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-9069) Fix some double semi-colons to single semi-colons, and update checkstyle

2018-03-24 Thread Jacob Park (JIRA)
Jacob Park created FLINK-9069:
-

 Summary: Fix some double semi-colons to single semi-colons, and 
update checkstyle
 Key: FLINK-9069
 URL: https://issues.apache.org/jira/browse/FLINK-9069
 Project: Flink
  Issue Type: Task
  Components: Checkstyle
Reporter: Jacob Park


As I was reading through the source code, I noticed that there were some double 
semi-colons ";;".

These should be fixed.

Finally, the tools/maven/checkstyle.xml should be updated accordingly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5760: [hotfix] [doc] fix maven version in building flink

2018-03-24 Thread bowenli86
GitHub user bowenli86 opened a pull request:

https://github.com/apache/flink/pull/5760

[hotfix] [doc] fix maven version in building flink

## What is the purpose of the change

The maven version in `start/building` is inconsistent. Make it consistent 
by changing the maven version to 3.0.4

## Brief change log

The maven version in `start/building` is inconsistent. Make it consistent 
by changing the maven version to 3.0.4

## Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

## Does this pull request potentially affect one of the following parts:

none

## Documentation

none


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bowenli86/flink hotfix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5760.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5760


commit 8a17ccedf33bed267f4dcfeef185cc589fb70fe1
Author: Bowen Li 
Date:   2018-03-24T15:11:16Z

[hotfix] fix maven version in building flink




---


[GitHub] flink issue #5704: [FLINK-8852] [sql-client] Add FLIP-6 support to SQL Clien...

2018-03-24 Thread liurenjie1024
Github user liurenjie1024 commented on the issue:

https://github.com/apache/flink/pull/5704
  
LGTM


---


[jira] [Commented] (FLINK-8852) SQL Client does not work with new FLIP-6 mode

2018-03-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412633#comment-16412633
 ] 

ASF GitHub Bot commented on FLINK-8852:
---

Github user liurenjie1024 commented on the issue:

https://github.com/apache/flink/pull/5704
  
LGTM


> SQL Client does not work with new FLIP-6 mode
> -
>
> Key: FLINK-8852
> URL: https://issues.apache.org/jira/browse/FLINK-8852
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API  SQL
>Affects Versions: 1.5.0
>Reporter: Fabian Hueske
>Assignee: Timo Walther
>Priority: Blocker
> Fix For: 1.5.0
>
>
> The SQL client does not submit queries to local Flink cluster that runs in 
> FLIP-6 mode. It doesn't throw an exception either.
> Job submission works if the legacy Flink cluster mode is used (`mode: old`)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9067) End-to-end test: Stream SQL query with failure and retry

2018-03-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412546#comment-16412546
 ] 

ASF GitHub Bot commented on FLINK-9067:
---

Github user zhangminglei commented on a diff in the pull request:

https://github.com/apache/flink/pull/5759#discussion_r176906809
  
--- Diff: flink-end-to-end-tests/test-scripts/test_streaming_sql.sh ---
@@ -0,0 +1,43 @@
+#!/usr/bin/env bash

+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.

+
+
+source "$(dirname "$0")"/common.sh
+

+TEST_PROGRAM_JAR=$TEST_INFRA_DIR/../../flink-end-to-end-tests/flink-stream-sql-test/target/StreamSQLTestProgram.jar
+
+# copy flink-table jar into lib folder
+cp $FLINK_DIR/opt/flink-table*jar $FLINK_DIR/lib
--- End diff --

Oh. I think because you start taskmanager.sh, so you have to copy that. Not 
sure about it.


> End-to-end test: Stream SQL query with failure and retry
> 
>
> Key: FLINK-9067
> URL: https://issues.apache.org/jira/browse/FLINK-9067
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API  SQL, Tests
>Affects Versions: 1.5.0
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>Priority: Major
>
> Implement a test job that runs a streaming SQL query with a temporary failure 
> and recovery.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5759: [FLINK-9067] [e2eTests] Add StreamSQLTestProgram a...

2018-03-24 Thread zhangminglei
Github user zhangminglei commented on a diff in the pull request:

https://github.com/apache/flink/pull/5759#discussion_r176906809
  
--- Diff: flink-end-to-end-tests/test-scripts/test_streaming_sql.sh ---
@@ -0,0 +1,43 @@
+#!/usr/bin/env bash

+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.

+
+
+source "$(dirname "$0")"/common.sh
+

+TEST_PROGRAM_JAR=$TEST_INFRA_DIR/../../flink-end-to-end-tests/flink-stream-sql-test/target/StreamSQLTestProgram.jar
+
+# copy flink-table jar into lib folder
+cp $FLINK_DIR/opt/flink-table*jar $FLINK_DIR/lib
--- End diff --

Oh. I think because you start taskmanager.sh, so you have to copy that. Not 
sure about it.


---


[jira] [Commented] (FLINK-9067) End-to-end test: Stream SQL query with failure and retry

2018-03-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412543#comment-16412543
 ] 

ASF GitHub Bot commented on FLINK-9067:
---

Github user zhangminglei commented on a diff in the pull request:

https://github.com/apache/flink/pull/5759#discussion_r176906704
  
--- Diff: flink-end-to-end-tests/test-scripts/test_streaming_sql.sh ---
@@ -0,0 +1,43 @@
+#!/usr/bin/env bash

+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.

+
+
+source "$(dirname "$0")"/common.sh
+

+TEST_PROGRAM_JAR=$TEST_INFRA_DIR/../../flink-end-to-end-tests/flink-stream-sql-test/target/StreamSQLTestProgram.jar
+
+# copy flink-table jar into lib folder
+cp $FLINK_DIR/opt/flink-table*jar $FLINK_DIR/lib
--- End diff --

Hi, @fhueske I have some small questions here. I can see that there are 
some jar files in the example folder(```$FLINK_DIR/example```), some jar files 
in the ```$FLINK_DIR/lib``` folder, and other jar files in the 
```$FLINK_DIR/opt``` folder. Why did you copy this jar file from opt  folder to 
lib folder, instead of putting to ```$FLINK_DIR/example``` ? Thanks.


> End-to-end test: Stream SQL query with failure and retry
> 
>
> Key: FLINK-9067
> URL: https://issues.apache.org/jira/browse/FLINK-9067
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API  SQL, Tests
>Affects Versions: 1.5.0
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>Priority: Major
>
> Implement a test job that runs a streaming SQL query with a temporary failure 
> and recovery.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5759: [FLINK-9067] [e2eTests] Add StreamSQLTestProgram a...

2018-03-24 Thread zhangminglei
Github user zhangminglei commented on a diff in the pull request:

https://github.com/apache/flink/pull/5759#discussion_r176906704
  
--- Diff: flink-end-to-end-tests/test-scripts/test_streaming_sql.sh ---
@@ -0,0 +1,43 @@
+#!/usr/bin/env bash

+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.

+
+
+source "$(dirname "$0")"/common.sh
+

+TEST_PROGRAM_JAR=$TEST_INFRA_DIR/../../flink-end-to-end-tests/flink-stream-sql-test/target/StreamSQLTestProgram.jar
+
+# copy flink-table jar into lib folder
+cp $FLINK_DIR/opt/flink-table*jar $FLINK_DIR/lib
--- End diff --

Hi, @fhueske I have some small questions here. I can see that there are 
some jar files in the example folder(```$FLINK_DIR/example```), some jar files 
in the ```$FLINK_DIR/lib``` folder, and other jar files in the 
```$FLINK_DIR/opt``` folder. Why did you copy this jar file from opt  folder to 
lib folder, instead of putting to ```$FLINK_DIR/example``` ? Thanks.


---


[jira] [Commented] (FLINK-9060) Deleting state using KeyedStateBackend.getKeys() throws Exception

2018-03-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412480#comment-16412480
 ] 

ASF GitHub Bot commented on FLINK-9060:
---

Github user sihuazhou commented on the issue:

https://github.com/apache/flink/pull/5751
  
@kl0u Got it! Addressing...


> Deleting state using KeyedStateBackend.getKeys() throws Exception
> -
>
> Key: FLINK-9060
> URL: https://issues.apache.org/jira/browse/FLINK-9060
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Reporter: Aljoscha Krettek
>Assignee: Sihua Zhou
>Priority: Blocker
> Fix For: 1.5.0
>
>
> Adding this test to {{StateBackendTestBase}} showcases the problem:
> {code}
> @Test
> public void testConcurrentModificationWithGetKeys() throws Exception {
>   AbstractKeyedStateBackend backend = 
> createKeyedBackend(IntSerializer.INSTANCE);
>   try {
>   ListStateDescriptor listStateDescriptor =
>   new ListStateDescriptor<>("foo", 
> StringSerializer.INSTANCE);
>   backend.setCurrentKey(1);
>   backend
>   .getPartitionedState(VoidNamespace.INSTANCE, 
> VoidNamespaceSerializer.INSTANCE, listStateDescriptor)
>   .add("Hello");
>   backend.setCurrentKey(2);
>   backend
>   .getPartitionedState(VoidNamespace.INSTANCE, 
> VoidNamespaceSerializer.INSTANCE, listStateDescriptor)
>   .add("Ciao");
>   Stream keys = backend
>   .getKeys(listStateDescriptor.getName(), 
> VoidNamespace.INSTANCE);
>   keys.forEach((key) -> {
>   backend.setCurrentKey(key);
>   try {
>   backend
>   .getPartitionedState(
>   VoidNamespace.INSTANCE,
>   
> VoidNamespaceSerializer.INSTANCE,
>   listStateDescriptor)
>   .clear();
>   } catch (Exception e) {
>   e.printStackTrace();
>   }
>   });
>   }
>   finally {
>   IOUtils.closeQuietly(backend);
>   backend.dispose();
>   }
> }
> {code}
> This should work because one of the use cases of {{getKeys()}} and 
> {{applyToAllKeys()}} is to do stuff for every key, which includes deleting 
> them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #5751: [FLINK-9060][state] Deleting state using KeyedStateBacken...

2018-03-24 Thread sihuazhou
Github user sihuazhou commented on the issue:

https://github.com/apache/flink/pull/5751
  
@kl0u Got it! Addressing...


---