[jira] [Comment Edited] (FLINK-4758) Remove IOReadableWritable from classes where not needed

2016-12-27 Thread jiwengang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15782230#comment-15782230
 ] 

jiwengang edited comment on FLINK-4758 at 12/28/16 6:53 AM:


Just remove the interface? Do the implementing classes still have to keep 
the methods from *IOReadableWritable*?


was (Author: jeffreyji666):
Just remove the interface? Do the implementing classes still have to keep 
the methods from **IOReadableWritable**?

> Remove IOReadableWritable from classes where not needed
> ---
>
> Key: FLINK-4758
> URL: https://issues.apache.org/jira/browse/FLINK-4758
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Affects Versions: 1.2.0
>Reporter: Stephan Ewen
> Fix For: 2.0.0
>
>
> Many classes implement the {{IOReadableWritable}} interface for historic 
> reasons, even where it is not needed any more.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-4758) Remove IOReadableWritable from classes where not needed

2016-12-27 Thread jiwengang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15782230#comment-15782230
 ] 

jiwengang edited comment on FLINK-4758 at 12/28/16 6:50 AM:


Just remove the interface? Do the implementing classes still have to keep 
the methods from **IOReadableWritable**?


was (Author: jeffreyji666):
Just remove the interface? Do the implementing classes still have to keep 
the methods from IOReadableWritable?

> Remove IOReadableWritable from classes where not needed
> ---
>
> Key: FLINK-4758
> URL: https://issues.apache.org/jira/browse/FLINK-4758
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Affects Versions: 1.2.0
>Reporter: Stephan Ewen
> Fix For: 2.0.0
>
>
> Many classes implement the {{IOReadableWritable}} interface for historic 
> reasons, even where it is not needed any more.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4758) Remove IOReadableWritable from classes where not needed

2016-12-27 Thread jiwengang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15782230#comment-15782230
 ] 

jiwengang commented on FLINK-4758:
--

Just remove the interface? Do the implementing classes still have to keep 
the methods from IOReadableWritable?
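
For illustration, a minimal sketch of what the cleanup means for an 
implementing class ({{MyTaskConfig}} is a hypothetical example, not a class 
from this issue):
{code}
import java.io.IOException

import org.apache.flink.core.io.IOReadableWritable
import org.apache.flink.core.memory.{DataInputView, DataOutputView}

// Before: implementing the interface forces the serialization methods.
class MyTaskConfig extends IOReadableWritable {
  private var parallelism: Int = 1

  @throws[IOException]
  override def write(out: DataOutputView): Unit = out.writeInt(parallelism)

  @throws[IOException]
  override def read(in: DataInputView): Unit = parallelism = in.readInt()
}

// After: once nothing serializes the class through this interface any more,
// both the extends clause and the read/write methods can be dropped.
class MyTaskConfigCleaned {
  private var parallelism: Int = 1
}
{code}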

> Remove IOReadableWritable from classes where not needed
> ---
>
> Key: FLINK-4758
> URL: https://issues.apache.org/jira/browse/FLINK-4758
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Affects Versions: 1.2.0
>Reporter: Stephan Ewen
> Fix For: 2.0.0
>
>
> Many classes implement the {{IOReadableWritable}} interface for historic 
> reasons, even where it is not needed any more.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #3048: Clarified the import path of the Breeze DenseVector

2016-12-27 Thread Fokko
Github user Fokko commented on the issue:

https://github.com/apache/flink/pull/3048
  
Locally, the tests pass. Looking at the error logs, the failure is unrelated 
to the changes in this PR, for example:
```
java.io.FileNotFoundException: build-target/lib/flink-dist-*.jar (No such 
file or directory)
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:220)
at java.util.zip.ZipFile.<init>(ZipFile.java:150)
at java.util.zip.ZipFile.<init>(ZipFile.java:121)
at sun.tools.jar.Main.list(Main.java:1060)
at sun.tools.jar.Main.run(Main.java:291)
at sun.tools.jar.Main.main(Main.java:1233)
find: `./flink-yarn-tests/target/flink-yarn-tests*': No such file or 
directory
```





---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Comment Edited] (FLINK-4920) Add a Scala Function Gauge

2016-12-27 Thread Pattarawat Chormai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15781253#comment-15781253
 ] 

Pattarawat Chormai edited comment on FLINK-4920 at 12/27/16 9:04 PM:
-

[~Zentol] Thanks for your suggestion.

[~StephanEwen] could you please assign me to the issue?


was (Author: heytitle):
@Chesnay Thanks for your suggestion.

@Stephan, could you please assign me to the issue?

> Add a Scala Function Gauge
> --
>
> Key: FLINK-4920
> URL: https://issues.apache.org/jira/browse/FLINK-4920
> Project: Flink
>  Issue Type: Improvement
>  Components: Metrics, Scala API
>Reporter: Stephan Ewen
>  Labels: easyfix, starter
>
> A useful metrics utility for the Scala API would be to add a Gauge that 
> obtains its value by calling a Scala Function0.
> That way, one can add Gauges in Scala programs using Scala lambda notation or 
> function references.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4920) Add a Scala Function Gauge

2016-12-27 Thread Pattarawat Chormai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15781253#comment-15781253
 ] 

Pattarawat Chormai commented on FLINK-4920:
---

@Chesnay Thanks for your suggestion.

@Stephan, could you please assign me to the issue?

> Add a Scala Function Gauge
> --
>
> Key: FLINK-4920
> URL: https://issues.apache.org/jira/browse/FLINK-4920
> Project: Flink
>  Issue Type: Improvement
>  Components: Metrics, Scala API
>Reporter: Stephan Ewen
>  Labels: easyfix, starter
>
> A useful metrics utility for the Scala API would be to add a Gauge that 
> obtains its value by calling a Scala Function0.
> That way, one can add Gauges in Scala programs using Scala lambda notation or 
> function references.
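
For illustration, a minimal sketch of such a utility ({{ScalaGauge}} is an 
assumed name; the issue only asks for a gauge backed by a {{Function0}}):
{code}
import org.apache.flink.metrics.Gauge

/** A Gauge that obtains its value by calling the given Scala function. */
class ScalaGauge[T](func: () => T) extends Gauge[T] {
  override def getValue: T = func()
}

// Usage inside a RichFunction, with a lambda:
//   getRuntimeContext.getMetricGroup.gauge("queueSize", new ScalaGauge[Int](() => queue.size))
{code}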



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5280) Extend TableSource to support nested data

2016-12-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15780914#comment-15780914
 ] 

ASF GitHub Bot commented on FLINK-5280:
---

Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/3039
  
Hi @fhueske, @wuchong 

Thank you for your reviews.

I think creating three abstract classes is a good idea; since we don't 
expect any new types of table sources, there will not be many combinations.

I'll try to update the PR tomorrow according to all current comments.


> Extend TableSource to support nested data
> -
>
> Key: FLINK-5280
> URL: https://issues.apache.org/jira/browse/FLINK-5280
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Fabian Hueske
>Assignee: Ivan Mushketyk
>
> The {{TableSource}} interface currently only supports the definition of 
> flat rows. 
> However, there are several storage formats for nested data that should be 
> supported such as Avro, Json, Parquet, and Orc. The Table API and SQL can 
> also natively handle nested rows.
> The {{TableSource}} interface and the code to register table sources in 
> Calcite's schema need to be extended to support nested data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #3039: [FLINK-5280] Update TableSource to support nested data

2016-12-27 Thread mushketyk
Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/3039
  
Hi @fhueske, @wuchong 

Thank you for your reviews.

I think creating three abstract classes is a good idea; since we don't 
expect any new types of table sources, there will not be many combinations.

I'll try to update the PR tomorrow according to all current comments.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5280) Extend TableSource to support nested data

2016-12-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15780667#comment-15780667
 ] 

ASF GitHub Bot commented on FLINK-5280:
---

Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3039
  
Thanks for working on this @mushketyk, and thanks for the reviews @wuchong.
I just added a comment regarding the Scala trait with implemented methods. 
I'll do a more thorough review in the next few days.

Thanks, Fabian


> Extend TableSource to support nested data
> -
>
> Key: FLINK-5280
> URL: https://issues.apache.org/jira/browse/FLINK-5280
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Fabian Hueske
>Assignee: Ivan Mushketyk
>
> The {{TableSource}} interface currently only supports the definition of 
> flat rows. 
> However, there are several storage formats for nested data that should be 
> supported such as Avro, Json, Parquet, and Orc. The Table API and SQL can 
> also natively handle nested rows.
> The {{TableSource}} interface and the code to register table sources in 
> Calcite's schema need to be extended to support nested data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #3039: [FLINK-5280] Update TableSource to support nested data

2016-12-27 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3039
  
Thanks for working on this @mushketyk, and thanks for the reviews @wuchong.
I just added a comment regarding the Scala trait with implemented methods. 
I'll do a more thorough review in the next few days.

Thanks, Fabian


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #3039: [FLINK-5280] Update TableSource to support nested ...

2016-12-27 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3039#discussion_r93943868
  
--- Diff: 
flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/KafkaTableSource.java
 ---
@@ -111,24 +112,17 @@
return kafkaSource;
}
 
-   @Override
-   public int getNumberOfFields() {
-   return fieldNames.length;
-   }
-
-   @Override
public String[] getFieldsNames() {
-   return fieldNames;
+   return TableSource$class.getFieldsNames(this);
--- End diff --

Hi,

I'm not sure about implementing this as a Scala trait with implemented 
methods. IMO, this makes it much harder to implement TableSources in Java (esp. 
for users who are not familiar with Scala and its implications). 

What do you think about implementing `TableSource` as an abstract class and 
providing three further abstract classes that extend `TableSource` with the 
batch interface, the stream interface, or both?
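
A hedged sketch of that hierarchy (all names and signatures are illustrative, 
not a final API): shared logic goes into an abstract base class, while the 
batch and stream contracts stay as pure traits, which compile to plain Java 
interfaces.
```scala
import org.apache.flink.api.common.typeinfo.TypeInformation
import org.apache.flink.api.java.{DataSet, ExecutionEnvironment}
import org.apache.flink.streaming.api.datastream.DataStream
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment

// Abstract base class instead of a trait with bodies: easy to extend from Java.
abstract class TableSource[T] {
  def getFieldsNames: Array[String]      // names of the produced fields
  def getReturnType: TypeInformation[T]  // type of the produced records
}

// Pure traits (no fields, no method bodies) compile to plain Java interfaces.
trait BatchTableSource[T] {
  def getDataSet(env: ExecutionEnvironment): DataSet[T]
}

trait StreamTableSource[T] {
  def getDataStream(env: StreamExecutionEnvironment): DataStream[T]
}

// The three abstract classes users (including Java users) would extend:
abstract class AbstractBatchTableSource[T]
  extends TableSource[T] with BatchTableSource[T]

abstract class AbstractStreamTableSource[T]
  extends TableSource[T] with StreamTableSource[T]

abstract class AbstractBatchStreamTableSource[T]
  extends TableSource[T] with BatchTableSource[T] with StreamTableSource[T]
```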


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5395) support locally build distribution by script create_release_files.sh

2016-12-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15780661#comment-15780661
 ] 

ASF GitHub Bot commented on FLINK-5395:
---

GitHub user shijinkui opened a pull request:

https://github.com/apache/flink/pull/3049

[FLINK-5395] [Build System] support locally build distribution by script 
create_release_files.sh

create_release_files.sh only builds the official Flink release. It's hard to 
build a custom local Flink release distribution.
Let create_release_files.sh support:
1. a custom git repo url
2. building a specific Scala and Hadoop version
3. adding `tools/flink` to .gitignore
4. printing a usage message
- [X] General
  - The pull request references the related JIRA issue ("[FLINK-5395] 
support locally build distribution by script create_release_files.sh")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [X] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [X] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shijinkui/flink FLINK-5395

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3049.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3049


commit 3b41c0942ef7ddd5921a32afbee2133392a594b7
Author: shijinkui 
Date:   2016-12-27T15:51:10Z

[FLINK-5395] [Build System] support locally build distribution by script 
create_release_files.sh




> support locally build distribution by script create_release_files.sh
> 
>
> Key: FLINK-5395
> URL: https://issues.apache.org/jira/browse/FLINK-5395
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: shijinkui
>
> create_release_files.sh only builds the official Flink release. It's hard to 
> build a custom local Flink release distribution.
> Let create_release_files.sh support:
> 1. a custom git repo url
> 2. building a specific Scala and Hadoop version
> 3. adding `tools/flink` to .gitignore
> 4. printing a usage message



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5280) Extend TableSource to support nested data

2016-12-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15780663#comment-15780663
 ] 

ASF GitHub Bot commented on FLINK-5280:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3039#discussion_r93943868
  
--- Diff: 
flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/KafkaTableSource.java
 ---
@@ -111,24 +112,17 @@
return kafkaSource;
}
 
-   @Override
-   public int getNumberOfFields() {
-   return fieldNames.length;
-   }
-
-   @Override
public String[] getFieldsNames() {
-   return fieldNames;
+   return TableSource$class.getFieldsNames(this);
--- End diff --

Hi,

I'm not sure about implementing this as a Scala trait with implemented 
methods. IMO, this makes it much harder to implement TableSources in Java (esp. 
for users who are not familiar with Scala and its implications). 

What do you think about implementing `TableSource` as an abstract class and 
providing three further abstract classes that extend `TableSource` with the 
batch interface, the stream interface, or both?


> Extend TableSource to support nested data
> -
>
> Key: FLINK-5280
> URL: https://issues.apache.org/jira/browse/FLINK-5280
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Fabian Hueske
>Assignee: Ivan Mushketyk
>
> The {{TableSource}} interface currently only supports the definition of 
> flat rows. 
> However, there are several storage formats for nested data that should be 
> supported such as Avro, Json, Parquet, and Orc. The Table API and SQL can 
> also natively handle nested rows.
> The {{TableSource}} interface and the code to register table sources in 
> Calcite's schema need to be extended to support nested data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #3049: [FLINK-5395] [Build System] support locally build ...

2016-12-27 Thread shijinkui
GitHub user shijinkui opened a pull request:

https://github.com/apache/flink/pull/3049

[FLINK-5395] [Build System] support locally build distribution by script 
create_release_files.sh

create_release_files.sh only builds the official Flink release. It's hard to 
build a custom local Flink release distribution.
Let create_release_files.sh support:
1. a custom git repo url
2. building a specific Scala and Hadoop version
3. adding `tools/flink` to .gitignore
4. printing a usage message
- [X] General
  - The pull request references the related JIRA issue ("[FLINK-5395] 
support locally build distribution by script create_release_files.sh")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [X] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [X] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shijinkui/flink FLINK-5395

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3049.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3049


commit 3b41c0942ef7ddd5921a32afbee2133392a594b7
Author: shijinkui 
Date:   2016-12-27T15:51:10Z

[FLINK-5395] [Build System] support locally build distribution by script 
create_release_files.sh




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-5395) support locally build distribution by script create_release_files.sh

2016-12-27 Thread shijinkui (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shijinkui updated FLINK-5395:
-
Description: 
create_release_files.sh only builds the official Flink release. It's hard to 
build a custom local Flink release distribution.

Let create_release_files.sh support:
1. a custom git repo url
2. building a specific Scala and Hadoop version
3. adding `tools/flink` to .gitignore
4. printing a usage message


  was:
create_release_files.sh only builds the official Flink release. It's hard to 
build a custom local Flink release distribution.

Let create_release_files.sh support:
1. a custom git repo url
2. building a specific Scala and Hadoop version
3. fixing flink-dist's opt.xml not getting its Scala version replaced by 
change-scala-version.sh


> support locally build distribution by script create_release_files.sh
> 
>
> Key: FLINK-5395
> URL: https://issues.apache.org/jira/browse/FLINK-5395
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: shijinkui
>
> create_release_files.sh only builds the official Flink release. It's hard to 
> build a custom local Flink release distribution.
> Let create_release_files.sh support:
> 1. a custom git repo url
> 2. building a specific Scala and Hadoop version
> 3. adding `tools/flink` to .gitignore
> 4. printing a usage message



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #3048: Clarified the import path of the Breeze DenseVecto...

2016-12-27 Thread Fokko
GitHub user Fokko opened a pull request:

https://github.com/apache/flink/pull/3048

Clarified the import path of the Breeze DenseVector

Guys,

I'm working on an extension of the ML library in Flink and stumbled upon 
this. Since it is such a trivial fix, I didn't create a JIRA issue. Keep up 
the good work!

Cheers,

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Fokko/flink fd-cleanup-package-structure

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3048.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3048


commit 3fd38fe9785d607a05d045cd54a05af9ed48e350
Author: Fokko Driesprong 
Date:   2016-12-27T14:43:15Z

Replaced the full import path with the BreezeDenseVector itself to make it 
more readable




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-5376) Misleading log statements in UnorderedStreamElementQueue

2016-12-27 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated FLINK-5376:
--
Description: 
The following are two examples where ordered stream element queue is mentioned:
{code}
LOG.debug("Put element into ordered stream element queue. New filling 
degree " +
  "({}/{}).", numberEntries, capacity);

return true;
  } else {
LOG.debug("Failed to put element into ordered stream element queue 
because it " +
{code}

I guess OrderedStreamElementQueue was coded first.

  was:
The following are two examples where ordered stream element queue is mentioned:
{code}
LOG.debug("Put element into ordered stream element queue. New filling 
degree " +
  "({}/{}).", numberEntries, capacity);

return true;
  } else {
LOG.debug("Failed to put element into ordered stream element queue 
because it " +
{code}
I guess OrderedStreamElementQueue was coded first.


> Misleading log statements in UnorderedStreamElementQueue
> 
>
> Key: FLINK-5376
> URL: https://issues.apache.org/jira/browse/FLINK-5376
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>
> The following are two examples where ordered stream element queue is 
> mentioned:
> {code}
> LOG.debug("Put element into ordered stream element queue. New filling 
> degree " +
>   "({}/{}).", numberEntries, capacity);
> return true;
>   } else {
> LOG.debug("Failed to put element into ordered stream element queue 
> because it " +
> {code}
> I guess OrderedStreamElementQueue was coded first.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3849) Add FilterableTableSource interface and translation rule

2016-12-27 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15780544#comment-15780544
 ] 

Fabian Hueske commented on FLINK-3849:
--

Do you mean we need a single rule for pushing projection and filters into a 
{{BatchTableSourceScan}}, so basically extending the existing 
{{PushProjectIntoBatchTableSourceScanRule}} and 
{{PushProjectIntoStreamTableSourceScanRule}}?

Can you explain why it would not be possible to have two separate rules?

Thanks, Fabian



> Add FilterableTableSource interface and translation rule
> 
>
> Key: FLINK-3849
> URL: https://issues.apache.org/jira/browse/FLINK-3849
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Fabian Hueske
>Assignee: Anton Solovev
>
> Add a {{FilterableTableSource}} interface for {{TableSource}} implementations 
> which support filter push-down.
> The interface could look as follows
> {code}
> trait FilterableTableSource {
>   // returns unsupported predicate expression
>   def setPredicate(predicate: Expression): Expression
> }
> {code}
> In addition we need Calcite rules to push a predicate (or parts of it) into a 
> TableScan that refers to a {{FilterableTableSource}}. We might need to tweak 
> the cost model as well to push the optimizer in the right direction.
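
For illustration, a minimal implementation of the proposed trait that pushes 
nothing down yet ({{LoggingFilterableSource}} is hypothetical, and the 
{{Expression}} import path may differ by version):
{code}
import org.apache.flink.api.table.expressions.Expression

class LoggingFilterableSource extends FilterableTableSource {
  // Remember the predicate that was offered for push-down.
  var offered: Option[Expression] = None

  override def setPredicate(predicate: Expression): Expression = {
    offered = Some(predicate)
    // Handle nothing yet: return the whole predicate as unsupported,
    // so Flink still evaluates it after the scan.
    predicate
  }
}
{code}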



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5358) Support RowTypeInfo extraction in TypeExtractor by Row instance

2016-12-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15780533#comment-15780533
 ] 

ASF GitHub Bot commented on FLINK-5358:
---

Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3027
  
Thanks @tonycox. PR is good to merge.


> Support RowTypeInfo extraction in TypeExtractor by Row instance
> ---
>
> Key: FLINK-5358
> URL: https://issues.apache.org/jira/browse/FLINK-5358
> Project: Flink
>  Issue Type: Improvement
>Reporter: Anton Solovev
>Assignee: Anton Solovev
>
> {code}
> Row[] data = new Row[]{};
> TypeInformation typeInfo = TypeExtractor.getForObject(data[0]);
> {code}
> the method {{getForObject}} wraps it into
> {code}
> GenericTypeInfo
> {code}
> the method should return {{RowTypeInfo}} instead.
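
For illustration, a sketch of the expected behavior after the fix (assuming 
{{Row.of}} from {{org.apache.flink.types.Row}}; the values are illustrative):
{code}
import org.apache.flink.api.java.typeutils.{RowTypeInfo, TypeExtractor}
import org.apache.flink.types.Row

object RowTypeExtractionExample extends App {
  val row = Row.of("hello", java.lang.Integer.valueOf(42))
  val typeInfo = TypeExtractor.getForObject(row)
  // Instead of wrapping the value in a GenericTypeInfo, the extractor
  // should now return a RowTypeInfo with one field type per position.
  assert(typeInfo.isInstanceOf[RowTypeInfo])
}
{code}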



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #3027: [FLINK-5358] add RowTypeInfo exctraction in TypeExtractor

2016-12-27 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3027
  
Thanks @tonycox. PR is good to merge.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #3027: [FLINK-5358] add RowTypeInfo exctraction in TypeEx...

2016-12-27 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3027#discussion_r93937117
  
--- Diff: 
flink-core/src/test/java/org/apache/flink/api/java/typeutils/TypeExtractorTest.java
 ---
@@ -345,8 +346,25 @@ public CustomType cross(CustomType first, Integer 
second) throws Exception {
 

Assert.assertFalse(TypeExtractor.getForClass(PojoWithNonPublicDefaultCtor.class)
 instanceof PojoTypeInfo);
}
-   
 
+   @Test
+   public void testRow() {
--- End diff --

Oh, I'm sorry. I overlooked that check.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5358) Support RowTypeInfo extraction in TypeExtractor by Row instance

2016-12-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15780528#comment-15780528
 ] 

ASF GitHub Bot commented on FLINK-5358:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3027#discussion_r93937117
  
--- Diff: 
flink-core/src/test/java/org/apache/flink/api/java/typeutils/TypeExtractorTest.java
 ---
@@ -345,8 +346,25 @@ public CustomType cross(CustomType first, Integer 
second) throws Exception {
 

Assert.assertFalse(TypeExtractor.getForClass(PojoWithNonPublicDefaultCtor.class)
 instanceof PojoTypeInfo);
}
-   
 
+   @Test
+   public void testRow() {
--- End diff --

Oh, I'm sorry. I overlooked that check.


> Support RowTypeInfo extraction in TypeExtractor by Row instance
> ---
>
> Key: FLINK-5358
> URL: https://issues.apache.org/jira/browse/FLINK-5358
> Project: Flink
>  Issue Type: Improvement
>Reporter: Anton Solovev
>Assignee: Anton Solovev
>
> {code}
> Row[] data = new Row[]{};
> TypeInformation typeInfo = TypeExtractor.getForObject(data[0]);
> {code}
> the method {{getForObject}} wraps it into
> {code}
> GenericTypeInfo
> {code}
> the method should return {{RowTypeInfo}} instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5392) flink-dist build failed when change scala version to 2.11

2016-12-27 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15780523#comment-15780523
 ] 

Fabian Hueske commented on FLINK-5392:
--

There is another PR addressing this issue: 
https://github.com/apache/flink/pull/3047

> flink-dist build failed when change scala version to 2.11
> -
>
> Key: FLINK-5392
> URL: https://issues.apache.org/jira/browse/FLINK-5392
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.3.0
> Environment: jdk 1.8.112
>Reporter: 刘喆
>
> When using tools/change-scala-version.sh 2.11, the build fails at 
> flink-dist, because some Scala version strings are hard-coded in 
> flink-dist/src/main/assemblies/opt.xml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-5396) flink-dist replace scala version in opt.xml by change-scala-version.sh

2016-12-27 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske closed FLINK-5396.

Resolution: Duplicate

Duplicate of FLINK-5392.
I'll add a link to PR [#3047|https://github.com/apache/flink/pull/3047] to 
FLINK-5392.

> flink-dist replace scala version in opt.xml by change-scala-version.sh
> --
>
> Key: FLINK-5396
> URL: https://issues.apache.org/jira/browse/FLINK-5396
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: shijinkui
>
> flink-dist is configured to replace the Scala version in bin.xml, but not in opt.xml



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4088) Add interface to save and load TableSources

2016-12-27 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15780452#comment-15780452
 ] 

Fabian Hueske commented on FLINK-4088:
--

A Scala trait can be used as a Java interface if it has neither member 
variables nor implemented methods.
It would be good if this Scala trait fulfilled these conditions.
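
A small illustration of that rule ({{SaveableTableSource}} is a hypothetical 
name):
{code}
// No fields and no method bodies: scalac emits a plain Java interface,
// so Java classes can implement this trait directly.
trait SaveableTableSource {
  def saveToFile(path: String): Unit
}

// Adding a body (e.g. `def saveToFile(path: String): Unit = {}`) would make
// scalac (2.11) also emit a SaveableTableSource$class helper with static
// forwarders, which is what complicates implementing the trait from Java.
{code}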

> Add interface to save and load TableSources
> ---
>
> Key: FLINK-4088
> URL: https://issues.apache.org/jira/browse/FLINK-4088
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.1.0
>Reporter: Fabian Hueske
>Assignee: Minudika Malshan
>
> Add an interface to save and load table sources similar to Java's 
> {{Serializable}} interface. 
> TableSources should implement the interface to become saveable and loadable.
> This could be used as follows:
> {code}
> val cts = new CsvTableSource(
>   "/path/to/csvfile",
>   Array("name", "age", "address"),
>   Array(BasicTypeInfo.STRING_TYPEINFO, ...),
>   ...
> )
> cts.saveToFile("/path/to/tablesource/file")
> // -
> val tEnv: TableEnvironment = ???
> tEnv.loadTableSource("persons", "/path/to/tablesource/file")
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5396) flink-dist replace scala version in opt.xml by change-scala-version.sh

2016-12-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15780308#comment-15780308
 ] 

ASF GitHub Bot commented on FLINK-5396:
---

GitHub user shijinkui opened a pull request:

https://github.com/apache/flink/pull/3047

[FLINK-5396] [Build System] flink-dist replace scala version in opt.x…

flink-dist is configured to replace the Scala version in bin.xml, but not in opt.xml

- [X] General
  - The pull request references the related JIRA issue ("[FLINK-5396] 
flink-dist replace scala version in opt.xml by change-scala-version.sh")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [X] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [X] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shijinkui/flink FLINK-5396

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3047.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3047


commit d8dc5015079dd0780c1831aa8b52d82e5a9bbbed
Author: shijinkui 
Date:   2016-12-27T12:29:16Z

[FLINK-5396] [Build System] flink-dist replace scala version in opt.xml by 
change-scala-version.sh




> flink-dist replace scala version in opt.xml by change-scala-version.sh
> --
>
> Key: FLINK-5396
> URL: https://issues.apache.org/jira/browse/FLINK-5396
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: shijinkui
>
> flink-dist is configured to replace the Scala version in bin.xml, but not in opt.xml



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #3047: [FLINK-5396] [Build System] flink-dist replace sca...

2016-12-27 Thread shijinkui
GitHub user shijinkui opened a pull request:

https://github.com/apache/flink/pull/3047

[FLINK-5396] [Build System] flink-dist replace scala version in opt.x…

flink-dist is configured to replace the Scala version in bin.xml, but not in opt.xml

- [X] General
  - The pull request references the related JIRA issue ("[FLINK-5396] 
flink-dist replace scala version in opt.xml by change-scala-version.sh")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [X] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [X] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shijinkui/flink FLINK-5396

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3047.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3047


commit d8dc5015079dd0780c1831aa8b52d82e5a9bbbed
Author: shijinkui 
Date:   2016-12-27T12:29:16Z

[FLINK-5396] [Build System] flink-dist replace scala version in opt.xml by 
change-scala-version.sh




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-5396) flink-dist replace scala version in opt.xml by change-scala-version.sh

2016-12-27 Thread shijinkui (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shijinkui updated FLINK-5396:
-
Component/s: Build System

> flink-dist replace scala version in opt.xml by change-scala-version.sh
> --
>
> Key: FLINK-5396
> URL: https://issues.apache.org/jira/browse/FLINK-5396
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: shijinkui
>
> flink-dist is configured to replace the Scala version in bin.xml, but not in opt.xml



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5396) flink-dist replace scala version in opt.xml by change-scala-version.sh

2016-12-27 Thread shijinkui (JIRA)
shijinkui created FLINK-5396:


 Summary: flink-dist replace scala version in opt.xml by 
change-scala-version.sh
 Key: FLINK-5396
 URL: https://issues.apache.org/jira/browse/FLINK-5396
 Project: Flink
  Issue Type: Improvement
Reporter: shijinkui


flink-dist is configured to replace the Scala version in bin.xml, but not in opt.xml



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5395) support locally build distribution by script create_release_files.sh

2016-12-27 Thread shijinkui (JIRA)
shijinkui created FLINK-5395:


 Summary: support locally build distribution by script 
create_release_files.sh
 Key: FLINK-5395
 URL: https://issues.apache.org/jira/browse/FLINK-5395
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Reporter: shijinkui


create_release_files.sh only builds the official Flink release. It's hard to 
build a custom local Flink release distribution.

Let create_release_files.sh support:
1. a custom git repo url
2. building a specific Scala and Hadoop version
3. fixing flink-dist's opt.xml not getting its Scala version replaced by 
change-scala-version.sh



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-5394) the estimateRowCount method of DataSetCalc didn't work

2016-12-27 Thread zhangjing (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangjing updated FLINK-5394:
-
Description: 
The estimateRowCount method of DataSetCalc doesn't work at the moment. 
If I run the following code,
`
Table table = tableEnv
  .fromDataSet(data, "a, b, c")
  .groupBy("a")
  .select("a, a.avg, b.sum, c.count")
  .where("a == 1");
`
the cost of every node in the optimized node tree is:
`
DataSetAggregate(groupBy=[a], select=[a, AVG(a) AS TMP_0, SUM(b) AS TMP_1, COUNT(c) AS TMP_2]): rowcount = 1000.0, cumulative cost = {3000.0 rows, 5000.0 cpu, 28000.0 io}
  DataSetCalc(select=[a, b, c], where=[=(a, 1)]): rowcount = 1000.0, cumulative cost = {2000.0 rows, 2000.0 cpu, 0.0 io}
  DataSetScan(table=[[_DataSetTable_0]]): rowcount = 1000.0, cumulative cost = {1000.0 rows, 1000.0 cpu, 0.0 io}
`
We expect the input rowcount of DataSetAggregate to be less than 1000; however, the actual input rowcount is still 1000 because the estimateRowCount method of DataSetCalc doesn't work.

There are two reasons for this:
1. We don't provide a custom metadataProvider yet, so when DataSetAggregate calls RelMetadataQuery.getRowCount(DataSetCalc) to estimate its input rowcount, the call dispatches to RelMdRowCount.
2. DataSetCalc is a subclass of SingleRel, so that call matches getRowCount(SingleRel rel, RelMetadataQuery mq), which never uses DataSetCalc.estimateRowCount.

The same problem applies to all Flink RelNodes that are subclasses of SingleRel.

I plan to resolve this by adding a FlinkRelMdRowCount that contains specific getRowCount methods for the Flink RelNodes.
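
For illustration, a hedged sketch of that plan, assuming Calcite's reflective 
metadata provider (the {{DataSetCalc}} import path and the registration 
details are illustrative, not the final implementation):
{code}
import java.lang.{Double => JDouble}

import org.apache.calcite.rel.metadata.{ReflectiveRelMetadataProvider, RelMdRowCount, RelMetadataProvider, RelMetadataQuery}
import org.apache.calcite.util.BuiltInMethod
import org.apache.flink.table.plan.nodes.dataset.DataSetCalc // path may differ by version

class FlinkRelMdRowCount extends RelMdRowCount {
  // Overload picked by Calcite's reflective dispatch for DataSetCalc, so the
  // node's own estimate is used instead of the generic SingleRel handler.
  def getRowCount(calc: DataSetCalc, mq: RelMetadataQuery): JDouble =
    calc.estimateRowCount(mq)
}

object FlinkRelMdRowCount {
  // To be registered ahead of Calcite's default providers in the planner.
  val SOURCE: RelMetadataProvider = ReflectiveRelMetadataProvider.reflectiveSource(
    BuiltInMethod.ROW_COUNT.method, new FlinkRelMdRowCount)
}
{code}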

  was:
The estimateRowCount method of DataSetCalc doesn't work at the moment. 
If I run the following code,
`
Table table = tableEnv
  .fromDataSet(data, "a, b, c")
  .groupBy("a")
  .select("a, a.avg, b.sum, c.count")
  .where("a == 1");
`
the cost of every node in the optimized node tree is:
`
DataSetAggregate(groupBy=[a], select=[a, AVG(a) AS TMP_0, SUM(b) AS TMP_1, COUNT(c) AS TMP_2]): rowcount = 1000.0, cumulative cost = {3000.0 rows, 5000.0 cpu, 28000.0 io}
  DataSetCalc(select=[a, b, c], where=[=(a, 1)]): rowcount = 1000.0, cumulative cost = {2000.0 rows, 2000.0 cpu, 0.0 io}
  DataSetScan(table=[[_DataSetTable_0]]): rowcount = 1000.0, cumulative cost = {1000.0 rows, 1000.0 cpu, 0.0 io}
`
We expect the input rowcount of DataSetAggregate to be less than 1000; however, the actual input rowcount is still 1000 because the estimateRowCount method of DataSetCalc doesn't work.

There are two reasons for this:
1. We don't provide a custom metadataProvider yet, so when DataSetAggregate calls RelMetadataQuery.getRowCount(DataSetCalc) to estimate its input rowcount, the call dispatches to RelMdRowCount.
2. DataSetCalc is a subclass of SingleRel, so that call matches getRowCount(SingleRel rel, RelMetadataQuery mq), which never uses DataSetCalc.estimateRowCount.

I plan to resolve this by adding a FlinkRelMdRowCount that contains specific getRowCount methods for the Flink RelNodes.


> the estimateRowCount method of DataSetCalc didn't work
> --
>
> Key: FLINK-5394
> URL: https://issues.apache.org/jira/browse/FLINK-5394
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: zhangjing
>Assignee: zhangjing
>
> The estimateRowCount method of DataSetCalc doesn't work at the moment. 
> If I run the following code,
> `
> Table table = tableEnv
>   .fromDataSet(data, "a, b, c")
>   .groupBy("a")
>   .select("a, a.avg, b.sum, c.count")
>   .where("a == 1");
> `
> the cost of every node in the optimized node tree is:
> `
> DataSetAggregate(groupBy=[a], select=[a, AVG(a) AS TMP_0, SUM(b) AS TMP_1, COUNT(c) AS TMP_2]): rowcount = 1000.0, cumulative cost = {3000.0 rows, 5000.0 cpu, 28000.0 io}
>   DataSetCalc(select=[a, b, c], where=[=(a, 1)]): rowcount = 1000.0, cumulative cost = {2000.0 rows, 2000.0 cpu, 0.0 io}
>   DataSetScan(table=[[_DataSetTable_0]]): rowcount = 1000.0, cumulative cost = {1000.0 rows, 1000.0 cpu, 0.0 io}
> `
> We expect the input rowcount of DataSetAggregate to be less than 1000; however, the actual input rowcount is still 1000 because the estimateRowCount method of DataSetCalc doesn't work.
> There are two reasons for this:
> 1. We don't provide a custom metadataProvider yet, so when DataSetAggregate calls RelMetadataQuery.getRowCount(DataSetCalc) to estimate its input rowcount, the call dispatches to RelMdRowCount.
> 2. DataSetCalc is a subclass of SingleRel, so that call matches getRowCount(SingleRel rel, RelMetadataQuery mq), which never uses DataSetCalc.estimateRowCount.
> The same problem applies to all Flink RelNodes that are subclasses of SingleRel.
> I plan to resolve this by adding a FlinkRelMdRowCount that contains specific getRowCount methods for the Flink RelNodes.


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (FLINK-5394) the estimateRowCount method of DataSetCalc didn't work

2016-12-27 Thread zhangjing (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangjing updated FLINK-5394:
-
Description: 
The estimateRowCount method of DataSetCalc doesn't work at the moment. 
If I run the following code,
`
Table table = tableEnv
  .fromDataSet(data, "a, b, c")
  .groupBy("a")
  .select("a, a.avg, b.sum, c.count")
  .where("a == 1");
`
the cost of every node in the optimized node tree is:
`
DataSetAggregate(groupBy=[a], select=[a, AVG(a) AS TMP_0, SUM(b) AS TMP_1, COUNT(c) AS TMP_2]): rowcount = 1000.0, cumulative cost = {3000.0 rows, 5000.0 cpu, 28000.0 io}
  DataSetCalc(select=[a, b, c], where=[=(a, 1)]): rowcount = 1000.0, cumulative cost = {2000.0 rows, 2000.0 cpu, 0.0 io}
  DataSetScan(table=[[_DataSetTable_0]]): rowcount = 1000.0, cumulative cost = {1000.0 rows, 1000.0 cpu, 0.0 io}
`
We expect the input rowcount of DataSetAggregate to be less than 1000; however, the actual input rowcount is still 1000 because the estimateRowCount method of DataSetCalc doesn't work.

There are two reasons for this:
1. We don't provide a custom metadataProvider yet, so when DataSetAggregate calls RelMetadataQuery.getRowCount(DataSetCalc) to estimate its input rowcount, the call dispatches to RelMdRowCount.
2. DataSetCalc is a subclass of SingleRel, so that call matches getRowCount(SingleRel rel, RelMetadataQuery mq), which never uses DataSetCalc.estimateRowCount.

I plan to resolve this by adding a FlinkRelMdRowCount that contains specific getRowCount methods for the Flink RelNodes.

  was:
The estimateRowCount method of DataSetCalc doesn't work at the moment. 
If I run the following code,
`
Table table = tableEnv
  .fromDataSet(data, "a, b, c")
  .groupBy("a")
  .select("a, a.avg, b.sum, c.count")
  .where("a == 1");
`
the cost of every node in the optimized node tree is:
`
DataSetAggregate(groupBy=[a], select=[a, AVG(a) AS TMP_0, SUM(b) AS TMP_1, COUNT(c) AS TMP_2]): rowcount = 1000.0, cumulative cost = {3000.0 rows, 5000.0 cpu, 28000.0 io}
  DataSetCalc(select=[a, b, c], where=[=(a, 1)]): rowcount = 1000.0, cumulative cost = {2000.0 rows, 2000.0 cpu, 0.0 io}
  DataSetScan(table=[[_DataSetTable_0]]): rowcount = 1000.0, cumulative cost = {1000.0 rows, 1000.0 cpu, 0.0 io}
`
We expect the input rowcount of DataSetAggregate to be less than 1000; however, the actual input rowcount is still 1000 because the estimateRowCount method of DataSetCalc doesn't work.

There are two reasons for this:
1. When DataSetAggregate calls RelMetadataQuery.getRowCount(DataSetCalc) to estimate its input rowcount, the call dispatches to RelMdRowCount.
2. DataSetCalc is a subclass of SingleRel, so that call matches getRowCount(SingleRel rel, RelMetadataQuery mq), which never uses DataSetCalc.estimateRowCount.

I plan to resolve this by adding a FlinkRelMdRowCount that contains specific getRowCount methods for the Flink RelNodes.


> the estimateRowCount method of DataSetCalc didn't work
> --
>
> Key: FLINK-5394
> URL: https://issues.apache.org/jira/browse/FLINK-5394
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: zhangjing
>Assignee: zhangjing
>
> The estimateRowCount method of DataSetCalc doesn't work at the moment. 
> If I run the following code,
> `
> Table table = tableEnv
>   .fromDataSet(data, "a, b, c")
>   .groupBy("a")
>   .select("a, a.avg, b.sum, c.count")
>   .where("a == 1");
> `
> the cost of every node in the optimized node tree is:
> `
> DataSetAggregate(groupBy=[a], select=[a, AVG(a) AS TMP_0, SUM(b) AS TMP_1, COUNT(c) AS TMP_2]): rowcount = 1000.0, cumulative cost = {3000.0 rows, 5000.0 cpu, 28000.0 io}
>   DataSetCalc(select=[a, b, c], where=[=(a, 1)]): rowcount = 1000.0, cumulative cost = {2000.0 rows, 2000.0 cpu, 0.0 io}
>   DataSetScan(table=[[_DataSetTable_0]]): rowcount = 1000.0, cumulative cost = {1000.0 rows, 1000.0 cpu, 0.0 io}
> `
> We expect the input rowcount of DataSetAggregate to be less than 1000; however, the actual input rowcount is still 1000 because the estimateRowCount method of DataSetCalc doesn't work.
> There are two reasons for this:
> 1. We don't provide a custom metadataProvider yet, so when DataSetAggregate calls RelMetadataQuery.getRowCount(DataSetCalc) to estimate its input rowcount, the call dispatches to RelMdRowCount.
> 2. DataSetCalc is a subclass of SingleRel, so that call matches getRowCount(SingleRel rel, RelMetadataQuery mq), which never uses DataSetCalc.estimateRowCount.
> I plan to resolve this by adding a FlinkRelMdRowCount that contains specific getRowCount methods for the Flink RelNodes.


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (FLINK-5394) the estimateRowCount method of DataSetCalc didn't work

2016-12-27 Thread zhangjing (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangjing updated FLINK-5394:
-
Description: 
The estimateRowCount method of DataSetCalc doesn't work at the moment. 
If I run the following code,
`
Table table = tableEnv
  .fromDataSet(data, "a, b, c")
  .groupBy("a")
  .select("a, a.avg, b.sum, c.count")
  .where("a == 1");
`
the cost of every node in the optimized node tree is:
`
DataSetAggregate(groupBy=[a], select=[a, AVG(a) AS TMP_0, SUM(b) AS TMP_1, COUNT(c) AS TMP_2]): rowcount = 1000.0, cumulative cost = {3000.0 rows, 5000.0 cpu, 28000.0 io}
  DataSetCalc(select=[a, b, c], where=[=(a, 1)]): rowcount = 1000.0, cumulative cost = {2000.0 rows, 2000.0 cpu, 0.0 io}
  DataSetScan(table=[[_DataSetTable_0]]): rowcount = 1000.0, cumulative cost = {1000.0 rows, 1000.0 cpu, 0.0 io}
`
We expect the input rowcount of DataSetAggregate to be less than 1000; however, the actual input rowcount is still 1000 because the estimateRowCount method of DataSetCalc doesn't work.

There are two reasons for this:
1. When DataSetAggregate calls RelMetadataQuery.getRowCount(DataSetCalc) to estimate its input rowcount, the call dispatches to RelMdRowCount.
2. DataSetCalc is a subclass of SingleRel, so that call matches getRowCount(SingleRel rel, RelMetadataQuery mq), which never uses DataSetCalc.estimateRowCount.

I plan to resolve this by adding a FlinkRelMdRowCount that contains specific getRowCount methods for the Flink RelNodes.

  was:
The estimateRowCount method of DataSetCalc doesn't work at the moment. 
If I run the following code,
`
Table table = tableEnv
  .fromDataSet(data, "a, b, c")
  .groupBy("a")
  .select("a, a.avg, b.sum, c.count")
  .where("a == 1");
`
the cost of every node in the optimized node tree is:
`
DataSetAggregate(groupBy=[a], select=[a, AVG(a) AS TMP_0, SUM(b) AS TMP_1, COUNT(c) AS TMP_2]): rowcount = 1000.0, cumulative cost = {3000.0 rows, 5000.0 cpu, 28000.0 io}
|_  DataSetCalc(select=[a, b, c], where=[=(a, 1)]): rowcount = 1000.0, cumulative cost = {2000.0 rows, 2000.0 cpu, 0.0 io}
 |_ DataSetScan(table=[[_DataSetTable_0]]): rowcount = 1000.0, cumulative cost = {1000.0 rows, 1000.0 cpu, 0.0 io}
`
We expect the input rowcount of DataSetAggregate to be less than 1000; however, the actual input rowcount is still 1000 because the estimateRowCount method of DataSetCalc doesn't work.

There are two reasons for this:
1. When DataSetAggregate calls RelMetadataQuery.getRowCount(DataSetCalc) to estimate its input rowcount, the call dispatches to RelMdRowCount.
2. DataSetCalc is a subclass of SingleRel, so that call matches getRowCount(SingleRel rel, RelMetadataQuery mq), which never uses DataSetCalc.estimateRowCount.

I plan to resolve this by adding a FlinkRelMdRowCount that contains specific getRowCount methods for the Flink RelNodes.


> the estimateRowCount method of DataSetCalc didn't work
> --
>
> Key: FLINK-5394
> URL: https://issues.apache.org/jira/browse/FLINK-5394
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: zhangjing
>Assignee: zhangjing
>
> The estimateRowCount method of DataSetCalc doesn't work at the moment. 
> If I run the following code,
> `
> Table table = tableEnv
>   .fromDataSet(data, "a, b, c")
>   .groupBy("a")
>   .select("a, a.avg, b.sum, c.count")
>   .where("a == 1");
> `
> the cost of every node in the optimized node tree is:
> `
> DataSetAggregate(groupBy=[a], select=[a, AVG(a) AS TMP_0, SUM(b) AS TMP_1, COUNT(c) AS TMP_2]): rowcount = 1000.0, cumulative cost = {3000.0 rows, 5000.0 cpu, 28000.0 io}
>   DataSetCalc(select=[a, b, c], where=[=(a, 1)]): rowcount = 1000.0, cumulative cost = {2000.0 rows, 2000.0 cpu, 0.0 io}
>   DataSetScan(table=[[_DataSetTable_0]]): rowcount = 1000.0, cumulative cost = {1000.0 rows, 1000.0 cpu, 0.0 io}
> `
> We expect the input rowcount of DataSetAggregate to be less than 1000; however, the actual input rowcount is still 1000 because the estimateRowCount method of DataSetCalc doesn't work.
> There are two reasons for this:
> 1. When DataSetAggregate calls RelMetadataQuery.getRowCount(DataSetCalc) to estimate its input rowcount, the call dispatches to RelMdRowCount.
> 2. DataSetCalc is a subclass of SingleRel, so that call matches getRowCount(SingleRel rel, RelMetadataQuery mq), which never uses DataSetCalc.estimateRowCount.
> I plan to resolve this by adding a FlinkRelMdRowCount that contains specific getRowCount methods for the Flink RelNodes.


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (FLINK-5394) the estimateRowCount method of DataSetCalc didn't work

2016-12-27 Thread zhangjing (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangjing updated FLINK-5394:
-
Description: 
The estimateRowCount method of DataSetCalc doesn't work at the moment. 
If I run the following code,
`
Table table = tableEnv
  .fromDataSet(data, "a, b, c")
  .groupBy("a")
  .select("a, a.avg, b.sum, c.count")
  .where("a == 1");
`
the cost of every node in the optimized node tree is:
`
DataSetAggregate(groupBy=[a], select=[a, AVG(a) AS TMP_0, SUM(b) AS TMP_1, COUNT(c) AS TMP_2]): rowcount = 1000.0, cumulative cost = {3000.0 rows, 5000.0 cpu, 28000.0 io}
|_  DataSetCalc(select=[a, b, c], where=[=(a, 1)]): rowcount = 1000.0, cumulative cost = {2000.0 rows, 2000.0 cpu, 0.0 io}
 |_ DataSetScan(table=[[_DataSetTable_0]]): rowcount = 1000.0, cumulative cost = {1000.0 rows, 1000.0 cpu, 0.0 io}
`
We expect the input rowcount of DataSetAggregate to be less than 1000; however, the actual input rowcount is still 1000 because the estimateRowCount method of DataSetCalc doesn't work.

There are two reasons for this:
1. When DataSetAggregate calls RelMetadataQuery.getRowCount(DataSetCalc) to estimate its input rowcount, the call dispatches to RelMdRowCount.
2. DataSetCalc is a subclass of SingleRel, so that call matches getRowCount(SingleRel rel, RelMetadataQuery mq), which never uses DataSetCalc.estimateRowCount.

I plan to resolve this by adding a FlinkRelMdRowCount that contains specific getRowCount methods for the Flink RelNodes.

  was:
The estimateRowCount method of DataSetCalc doesn't work at the moment. 
If I run the following code,
`
Table table = tableEnv
  .fromDataSet(data, "a, b, c")
  .groupBy("a")
  .select("a, a.avg, b.sum, c.count")
  .where("a == 1");
`
the cost of every node in the optimized node tree is:
`
DataSetAggregate(groupBy=[a], select=[a, AVG(a) AS TMP_0, SUM(b) AS TMP_1, COUNT(c) AS TMP_2]): rowcount = 1000.0, cumulative cost = {3000.0 rows, 5000.0 cpu, 28000.0 io}
  DataSetCalc(select=[a, b, c], where=[=(a, 1)]): rowcount = 1000.0, cumulative cost = {2000.0 rows, 2000.0 cpu, 0.0 io}
DataSetScan(table=[[_DataSetTable_0]]): rowcount = 1000.0, cumulative cost = {1000.0 rows, 1000.0 cpu, 0.0 io}
`
We expect the input rowcount of DataSetAggregate to be less than 1000; however, the actual input rowcount is still 1000 because the estimateRowCount method of DataSetCalc doesn't work.

There are two reasons for this:
1. When DataSetAggregate calls RelMetadataQuery.getRowCount(DataSetCalc) to estimate its input rowcount, the call dispatches to RelMdRowCount.
2. DataSetCalc is a subclass of SingleRel, so that call matches getRowCount(SingleRel rel, RelMetadataQuery mq), which never uses DataSetCalc.estimateRowCount.

I plan to resolve this by adding a FlinkRelMdRowCount that contains specific getRowCount methods for the Flink RelNodes.


> the estimateRowCount method of DataSetCalc didn't work
> --
>
> Key: FLINK-5394
> URL: https://issues.apache.org/jira/browse/FLINK-5394
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: zhangjing
>Assignee: zhangjing
>
> The estimateRowCount method of DataSetCalc doesn't work at the moment. 
> If I run the following code,
> `
> Table table = tableEnv
>   .fromDataSet(data, "a, b, c")
>   .groupBy("a")
>   .select("a, a.avg, b.sum, c.count")
>   .where("a == 1");
> `
> the cost of every node in the optimized node tree is:
> `
> DataSetAggregate(groupBy=[a], select=[a, AVG(a) AS TMP_0, SUM(b) AS TMP_1, COUNT(c) AS TMP_2]): rowcount = 1000.0, cumulative cost = {3000.0 rows, 5000.0 cpu, 28000.0 io}
> |_  DataSetCalc(select=[a, b, c], where=[=(a, 1)]): rowcount = 1000.0, cumulative cost = {2000.0 rows, 2000.0 cpu, 0.0 io}
>  |_ DataSetScan(table=[[_DataSetTable_0]]): rowcount = 1000.0, cumulative cost = {1000.0 rows, 1000.0 cpu, 0.0 io}
> `
> We expect the input rowcount of DataSetAggregate to be less than 1000; however, the actual input rowcount is still 1000 because the estimateRowCount method of DataSetCalc doesn't work.
> There are two reasons for this:
> 1. When DataSetAggregate calls RelMetadataQuery.getRowCount(DataSetCalc) to estimate its input rowcount, the call dispatches to RelMdRowCount.
> 2. DataSetCalc is a subclass of SingleRel, so that call matches getRowCount(SingleRel rel, RelMetadataQuery mq), which never uses DataSetCalc.estimateRowCount.
> I plan to resolve this by adding a FlinkRelMdRowCount that contains specific getRowCount methods for the Flink RelNodes.


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (FLINK-5394) the estimateRowCount method of DataSetCalc didn't work

2016-12-27 Thread zhangjing (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangjing updated FLINK-5394:
-
Description: 
The estimateRowCount method of DataSetCalc didn't work now. 
If I run the following code,
`
Table table = tableEnv
.fromDataSet(data, "a, b, c")
.groupBy("a")
.select("a, a.avg, b.sum, c.count")
.where("a == 1");
`
the cost of every node in the optimized node tree is:
`
DataSetAggregate(groupBy=[a], select=[a, AVG(a) AS TMP_0, SUM(b) AS TMP_1, 
COUNT(c) AS TMP_2]): rowcount = 1000.0, cumulative cost = {3000.0 rows, 5000.0 
cpu, 28000.0 io}
  DataSetCalc(select=[a, b, c], where=[=(a, 1)]): rowcount = 1000.0, cumulative 
cost = {2000.0 rows, 2000.0 cpu, 0.0 io}
DataSetScan(table=[[_DataSetTable_0]]): rowcount = 1000.0, cumulative cost 
= {1000.0 rows, 1000.0 cpu, 0.0 io}
`
We expect the input rowcount of DataSetAggregate to be less than 1000; however, 
the actual input rowcount is still 1000 because the estimateRowCount method of 
DataSetCalc does not work. 

There are two reasons for this:
1. When DataSetAggregate calls RelMetadataQuery.getRowCount(DataSetCalc) to 
estimate its input rowcount, the call is dispatched to RelMdRowCount.
2. DataSetCalc is a subclass of SingleRel, so the previous call matches 
getRowCount(SingleRel rel, RelMetadataQuery mq), which never uses 
DataSetCalc.estimateRowCount.

I plan to resolve this problem by adding a FlinkRelMdRowCount which contains 
specific getRowCount implementations for Flink RelNodes.

  was:
The estimateRowCount method of DataSetCalc does not work at the moment. 
If I run the following code,
`
Table table = tableEnv
.fromDataSet(data, "a, b, c")
.groupBy("a")
.select("a, a.avg, b.sum, c.count")
.where("a == 1");
`
the cost of every node in the optimized node tree is:
`
DataSetAggregate(groupBy=[a], select=[a, AVG(a) AS TMP_0, SUM(b) AS TMP_1, 
COUNT(c) AS TMP_2]): rowcount = 1000.0, cumulative cost = {3000.0 rows, 5000.0 
cpu, 28000.0 io}, id = 28
  DataSetCalc(select=[a, b, c], where=[=(a, 1)]): rowcount = 1000.0, cumulative 
cost = {2000.0 rows, 2000.0 cpu, 0.0 io}, id = 27
DataSetScan(table=[[_DataSetTable_0]]): rowcount = 1000.0, cumulative cost 
= {1000.0 rows, 1000.0 cpu, 0.0 io}, id = 20
`
We expect the input rowcount of DataSetAggregate to be less than 1000; however, 
the actual input rowcount is still 1000 because the estimateRowCount method of 
DataSetCalc does not work. 

There are two reasons for this:
1. When DataSetAggregate calls RelMetadataQuery.getRowCount(DataSetCalc) to 
estimate its input rowcount, the call is dispatched to RelMdRowCount.
2. DataSetCalc is a subclass of SingleRel, so the previous call matches 
getRowCount(SingleRel rel, RelMetadataQuery mq), which never uses 
DataSetCalc.estimateRowCount.

I plan to resolve this problem by adding a FlinkRelMdRowCount which contains 
specific getRowCount implementations for Flink RelNodes.


> the estimateRowCount method of DataSetCalc didn't work
> --
>
> Key: FLINK-5394
> URL: https://issues.apache.org/jira/browse/FLINK-5394
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: zhangjing
>Assignee: zhangjing
>
> The estimateRowCount method of DataSetCalc does not work at the moment. 
> If I run the following code,
> `
> Table table = tableEnv
>   .fromDataSet(data, "a, b, c")
>   .groupBy("a")
>   .select("a, a.avg, b.sum, c.count")
>   .where("a == 1");
> `
> the cost of every node in the optimized node tree is:
> `
> DataSetAggregate(groupBy=[a], select=[a, AVG(a) AS TMP_0, SUM(b) AS TMP_1, 
> COUNT(c) AS TMP_2]): rowcount = 1000.0, cumulative cost = {3000.0 rows, 
> 5000.0 cpu, 28000.0 io}
>   DataSetCalc(select=[a, b, c], where=[=(a, 1)]): rowcount = 1000.0, 
> cumulative cost = {2000.0 rows, 2000.0 cpu, 0.0 io}
> DataSetScan(table=[[_DataSetTable_0]]): rowcount = 1000.0, cumulative 
> cost = {1000.0 rows, 1000.0 cpu, 0.0 io}
> `
> We expect the input rowcount of DataSetAggregate to be less than 1000; however, 
> the actual input rowcount is still 1000 because the estimateRowCount method 
> of DataSetCalc does not work. 
> There are two reasons for this:
> 1. When DataSetAggregate calls RelMetadataQuery.getRowCount(DataSetCalc) to 
> estimate its input rowcount, the call is dispatched to RelMdRowCount.
> 2. DataSetCalc is a subclass of SingleRel, so the previous call matches 
> getRowCount(SingleRel rel, RelMetadataQuery mq), which never uses 
> DataSetCalc.estimateRowCount.
> I plan to resolve this problem by adding a FlinkRelMdRowCount which contains 
> specific getRowCount implementations for Flink RelNodes.

[jira] [Created] (FLINK-5394) the estimateRowCount method of DataSetCalc didn't work

2016-12-27 Thread zhangjing (JIRA)
zhangjing created FLINK-5394:


 Summary: the estimateRowCount method of DataSetCalc didn't work
 Key: FLINK-5394
 URL: https://issues.apache.org/jira/browse/FLINK-5394
 Project: Flink
  Issue Type: Bug
  Components: Table API & SQL
Reporter: zhangjing
Assignee: zhangjing


The estimateRowCount method of DataSetCalc does not work at the moment. 
If I run the following code,
`
Table table = tableEnv
.fromDataSet(data, "a, b, c")
.groupBy("a")
.select("a, a.avg, b.sum, c.count")
.where("a == 1");
`
the cost of every node in the optimized node tree is:
`
DataSetAggregate(groupBy=[a], select=[a, AVG(a) AS TMP_0, SUM(b) AS TMP_1, 
COUNT(c) AS TMP_2]): rowcount = 1000.0, cumulative cost = {3000.0 rows, 5000.0 
cpu, 28000.0 io}, id = 28
  DataSetCalc(select=[a, b, c], where=[=(a, 1)]): rowcount = 1000.0, cumulative 
cost = {2000.0 rows, 2000.0 cpu, 0.0 io}, id = 27
DataSetScan(table=[[_DataSetTable_0]]): rowcount = 1000.0, cumulative cost 
= {1000.0 rows, 1000.0 cpu, 0.0 io}, id = 20
`
We expect the input rowcount of DataSetAggregate to be less than 1000; however, 
the actual input rowcount is still 1000 because the estimateRowCount method of 
DataSetCalc does not work. 

There are two reasons for this:
1. When DataSetAggregate calls RelMetadataQuery.getRowCount(DataSetCalc) to 
estimate its input rowcount, the call is dispatched to RelMdRowCount.
2. DataSetCalc is a subclass of SingleRel, so the previous call matches 
getRowCount(SingleRel rel, RelMetadataQuery mq), which never uses 
DataSetCalc.estimateRowCount.

I plan to resolve this problem by adding a FlinkRelMdRowCount which contains 
specific getRowCount implementations for Flink RelNodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #3036: [FLINK-5368] Throw exception if kafka topic doesn'...

2016-12-27 Thread HungUnicorn
GitHub user HungUnicorn reopened a pull request:

https://github.com/apache/flink/pull/3036

[FLINK-5368] Throw exception if kafka topic doesn't exist

As a developer reading data from many topics, I want the Kafka consumer to 
report something if any topic is not available. 

The motivation is that we read many topics as a list at one time, and sometimes 
we fail to recognize that one or two topic names have been changed or 
deprecated, while the Flink Kafka connector did not surface the error.
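For illustration, a minimal job reading a topic list (hypothetical topic names 
and broker address; the list-based constructor is the existing 
FlinkKafkaConsumer09 API):

```java
import java.util.Arrays;
import java.util.Properties;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class MultiTopicJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.setProperty("group.id", "multi-topic-demo");        // placeholder group id

        // If "events-v2" was renamed or deprecated, the consumer previously
        // read on silently; this PR makes the problem visible.
        DataStream<String> stream = env.addSource(new FlinkKafkaConsumer09<>(
                Arrays.asList("orders", "events-v2"),
                new SimpleStringSchema(),
                props));

        stream.print();
        env.execute("multi-topic demo");
    }
}
```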

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HungUnicorn/flink master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3036.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3036


commit b632edead770fd8386a65b6f67c739ad9c280a7c
Author: HungUnicorn 
Date:   2016-12-27T10:37:30Z

[FLINK-5368] log msg if kafka topic doesn't have any partitions




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5368) Let Kafka consumer show something when it fails to read one topic out of topic list

2016-12-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15780140#comment-15780140
 ] 

ASF GitHub Bot commented on FLINK-5368:
---

GitHub user HungUnicorn reopened a pull request:

https://github.com/apache/flink/pull/3036

[FLINK-5368] Throw exception if kafka topic doesn't exist

As a developer reading data from many topics, I want the Kafka consumer to 
report something if any topic is not available. 

The motivation is that we read many topics as a list at one time, and sometimes 
we fail to recognize that one or two topic names have been changed or 
deprecated, while the Flink Kafka connector did not surface the error.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HungUnicorn/flink master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3036.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3036


commit b632edead770fd8386a65b6f67c739ad9c280a7c
Author: HungUnicorn 
Date:   2016-12-27T10:37:30Z

[FLINK-5368] log msg if kafka topic doesn't have any partitions




> Let Kafka consumer show something when it fails to read one topic out of 
> topic list
> ---
>
> Key: FLINK-5368
> URL: https://issues.apache.org/jira/browse/FLINK-5368
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Sendoh
>Assignee: Sendoh
>Priority: Critical
>
> As a developer reading data from many topics, I want the Kafka consumer to 
> report something if any topic is not available. The motivation is that we read 
> many topics as a list at one time, and sometimes we fail to recognize that one 
> or two topic names have been changed or deprecated, while the Flink Kafka 
> connector does not surface the error.
> My proposed change would be either to throw a RuntimeException or to use 
> LOG.error(topic + "doesn't have any partition") if partitionsForTopic is null 
> in this function: 
> https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka-0.9/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumer09.java#L208
> Any suggestion is welcome.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5368) Let Kafka consumer show something when it fails to read one topic out of topic list

2016-12-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15780108#comment-15780108
 ] 

ASF GitHub Bot commented on FLINK-5368:
---

Github user HungUnicorn closed the pull request at:

https://github.com/apache/flink/pull/3036


> Let Kafka consumer show something when it fails to read one topic out of 
> topic list
> ---
>
> Key: FLINK-5368
> URL: https://issues.apache.org/jira/browse/FLINK-5368
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Sendoh
>Assignee: Sendoh
>Priority: Critical
>
> As a developer reading data from many topics, I want the Kafka consumer to 
> report something if any topic is not available. The motivation is that we read 
> many topics as a list at one time, and sometimes we fail to recognize that one 
> or two topic names have been changed or deprecated, while the Flink Kafka 
> connector does not surface the error.
> My proposed change would be either to throw a RuntimeException or to use 
> LOG.error(topic + "doesn't have any partition") if partitionsForTopic is null 
> in this function: 
> https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka-0.9/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumer09.java#L208
> Any suggestion is welcome.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #3036: [FLINK-5368] Throw exception if kafka topic doesn'...

2016-12-27 Thread HungUnicorn
Github user HungUnicorn closed the pull request at:

https://github.com/apache/flink/pull/3036


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5368) Let Kafka consumer show something when it fails to read one topic out of topic list

2016-12-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15780093#comment-15780093
 ] 

ASF GitHub Bot commented on FLINK-5368:
---

Github user HungUnicorn commented on a diff in the pull request:

https://github.com/apache/flink/pull/3036#discussion_r93916373
  
--- Diff: 
flink-connectors/flink-connector-kafka-0.9/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumer09.java
 ---
@@ -208,13 +208,12 @@ public FlinkKafkaConsumer09(List<String> topics, KeyedDeserializationSchema<T> deserializer
		if (partitionsForTopic != null) {
			partitions.addAll(convertToFlinkKafkaTopicPartition(partitionsForTopic));
		}
+		else {
+			throw new RuntimeException("Unable to retrieve any partitions for the requested topics " + topic);
--- End diff --

Agreed. I will change the exception to an INFO log message.
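
For reference, a minimal sketch of the logging variant under discussion (a 
hypothetical standalone helper; the real change lives inside the 
FlinkKafkaConsumer09 constructor):

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.PartitionInfo;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class PartitionDiscoverySketch {
    private static final Logger LOG = LoggerFactory.getLogger(PartitionDiscoverySketch.class);

    // Collect partition infos for each requested topic; log instead of
    // throwing when a topic yields none, so a renamed or deprecated topic
    // no longer goes unnoticed while the job keeps running.
    static List<PartitionInfo> discoverPartitions(KafkaConsumer<?, ?> consumer, List<String> topics) {
        List<PartitionInfo> partitions = new ArrayList<>();
        for (String topic : topics) {
            List<PartitionInfo> partitionsForTopic = consumer.partitionsFor(topic);
            if (partitionsForTopic != null) {
                partitions.addAll(partitionsForTopic);
            } else {
                LOG.info("Unable to retrieve any partitions for the requested topic {}", topic);
            }
        }
        return partitions;
    }
}
```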


> Let Kafka consumer show something when it fails to read one topic out of 
> topic list
> ---
>
> Key: FLINK-5368
> URL: https://issues.apache.org/jira/browse/FLINK-5368
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Sendoh
>Assignee: Sendoh
>Priority: Critical
>
> As a developer reading data from many topics, I want the Kafka consumer to 
> report something if any topic is not available. The motivation is that we read 
> many topics as a list at one time, and sometimes we fail to recognize that one 
> or two topic names have been changed or deprecated, while the Flink Kafka 
> connector does not surface the error.
> My proposed change would be either to throw a RuntimeException or to use 
> LOG.error(topic + "doesn't have any partition") if partitionsForTopic is null 
> in this function: 
> https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka-0.9/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumer09.java#L208
> Any suggestion is welcome.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #3036: [FLINK-5368] Throw exception if kafka topic doesn'...

2016-12-27 Thread HungUnicorn
Github user HungUnicorn commented on a diff in the pull request:

https://github.com/apache/flink/pull/3036#discussion_r93916373
  
--- Diff: 
flink-connectors/flink-connector-kafka-0.9/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumer09.java
 ---
@@ -208,13 +208,12 @@ public FlinkKafkaConsumer09(List<String> topics, KeyedDeserializationSchema<T> deserializer
		if (partitionsForTopic != null) {
			partitions.addAll(convertToFlinkKafkaTopicPartition(partitionsForTopic));
		}
+		else {
+			throw new RuntimeException("Unable to retrieve any partitions for the requested topics " + topic);
--- End diff --

Agreed. I will change the exception to an INFO log message.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Closed] (FLINK-4670) Add watch mechanism on current RPC framework

2016-12-27 Thread zhangjing (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangjing closed FLINK-4670.

Resolution: Not A Problem

> Add watch mechanism on current RPC framework
> 
>
> Key: FLINK-4670
> URL: https://issues.apache.org/jira/browse/FLINK-4670
> Project: Flink
>  Issue Type: Sub-task
>  Components: Cluster Management
>Reporter: zhangjing
>Assignee: zhangjing
> Fix For: 1.2.0
>
>
> Add a watch mechanism to the current RPC framework so that an RPC gateway can 
> be watched to make sure the RPC server is running, just like the previous 
> DeathWatch in Akka



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)