[jira] [Created] (FLINK-6225) Support Row Stream for CassandraSink

2017-03-30 Thread Jing Fan (JIRA)
Jing Fan created FLINK-6225:
---

 Summary: Support Row Stream for CassandraSink
 Key: FLINK-6225
 URL: https://issues.apache.org/jira/browse/FLINK-6225
 Project: Flink
  Issue Type: New Feature
  Components: Cassandra Connector
Affects Versions: 1.3.0
Reporter: Jing Fan
 Fix For: 1.3.0


Currently in CassandraSink, specifying query is not supported for row-stream. 
The solution should be similar to CassandraTupleSink.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [Bug]Question about StreamExecutionEnvironment.createRemoteEnvironment

2017-03-30 Thread canbinzheng
Hi, I have opened an issue for the  probelm
https://issues.apache.org/jira/browse/FLINK-6224



--
View this message in context: 
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Bug-Question-about-StreamExecutionEnvironment-createRemoteEnvironment-tp16854p16862.html
Sent from the Apache Flink Mailing List archive. mailing list archive at 
Nabble.com.


[jira] [Created] (FLINK-6224) RemoteStreamEnvironment not resolve hostname of JobManager

2017-03-30 Thread CanBin Zheng (JIRA)
CanBin Zheng created FLINK-6224:
---

 Summary: RemoteStreamEnvironment not resolve hostname of JobManager
 Key: FLINK-6224
 URL: https://issues.apache.org/jira/browse/FLINK-6224
 Project: Flink
  Issue Type: Bug
  Components: Client, DataStream API
Affects Versions: 1.2.0
Reporter: CanBin Zheng
Assignee: CanBin Zheng


I run two examples in the same client. 

first one use
ExecutionEnvironment.createRemoteEnvironment("10.75.203.170", 59551)
second one use
StreamExecutionEnvironment.createRemoteEnvironment("10.75.203.170", 59551)

the first one runs successfully, but the second example fails(connect to 
JobManager timeout), for the second one, if I change host parameter from ip to 
hostname, it works. 

I checked the source code and found, 
ExecutionEnvironment.createRemoteEnvironment 
resolves the given hostname, this will lookup the hostname for the given 
ip. In contrast, the StreamExecutionEnvironment.createRemoteEnvironment 
won't do this.

As Till Rohrmann mentioned,  the problem is that with 
FLINK-2821 [1], we can no longer resolve the hostname on the JobManager, so, 
we'd better resolve hostname for given ip in RemoteStreamEnvironment too.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6223) Rework PythonPlanBinder generics

2017-03-30 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-6223:
---

 Summary: Rework PythonPlanBinder generics
 Key: FLINK-6223
 URL: https://issues.apache.org/jira/browse/FLINK-6223
 Project: Flink
  Issue Type: Improvement
  Components: Python API
Affects Versions: 1.3.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.3.0


The PythonPlanBinder is loaded with raw usages of parameterized objects and 
unchecked casts. We can rework this to be a bit more compiler friendly.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6222) YARN: setting environment variables in an easier fashion

2017-03-30 Thread Craig Foster (JIRA)
Craig Foster created FLINK-6222:
---

 Summary: YARN: setting environment variables in an easier fashion
 Key: FLINK-6222
 URL: https://issues.apache.org/jira/browse/FLINK-6222
 Project: Flink
  Issue Type: Improvement
  Components: Startup Shell Scripts
Affects Versions: 1.2.0
 Environment: YARN, EMR
Reporter: Craig Foster


Right now we require end-users to set YARN_CONF_DIR or HADOOP_CONF_DIR and 
sometimes FLINK_CONF_DIR.
For example, in [1], it is stated: 
“Please note that the Client requires the YARN_CONF_DIR or HADOOP_CONF_DIR 
environment variable to be set to read the YARN and HDFS configuration.” 

In BigTop, we set this with /etc/flink/default and then a wrapper is created to 
source that. However, this is slightly cumbersome and we don't have a central 
place within the Flink project itself to source environment variables. 
config.sh could do this but it doesn't have information about FLINK_CONF_DIR. 
For YARN and Hadoop variables, I already have a solution that would add 
"env.yarn.confdir" and "env.hadoop.confdir" variables to the flink-conf.yaml 
file and then we just symlink /etc/lib/flink/conf/ and /etc/flink/conf. 

But we could also add a flink-env.sh file to set these variables and decouple 
them from config.sh entirely. 

I'd like to know the opinion/preference of others and what would be more 
amenable. 




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [VOTE] Release Apache Flink 1.2.1 (RC1)

2017-03-30 Thread Aljoscha Krettek
https://issues.apache.org/jira/browse/FLINK-6188 turns out to be a bit
more involved, see my comments on the PR:
https://github.com/apache/flink/pull/3616.

As I said there, maybe we should revert the commits regarding
parallelism/max-parallelism changes and release and then fix it later.

On Wed, Mar 29, 2017, at 23:08, Aljoscha Krettek wrote:
> I commented on FLINK-6214: I think it's working as intended, although we
> could fix the javadoc/doc.
> 
> On Wed, Mar 29, 2017, at 17:35, Timo Walther wrote:
> > A user reported that all tumbling and slinding window assigners contain 
> > a pretty obvious bug about offsets.
> > 
> > https://issues.apache.org/jira/browse/FLINK-6214
> > 
> > I think we should also fix this for 1.2.1. What do you think?
> > 
> > Regards,
> > Timo
> > 
> > 
> > Am 29/03/17 um 11:30 schrieb Robert Metzger:
> > > Hi Haohui,
> > > I agree that we should fix the parallelism issue. Otherwise, the 1.2.1
> > > release would introduce a new bug.
> > >
> > > On Tue, Mar 28, 2017 at 11:59 PM, Haohui Mai  wrote:
> > >
> > >> -1 (non-binding)
> > >>
> > >> We recently found out that all jobs submitted via UI will have a
> > >> parallelism of 1, potentially due to FLINK-5808.
> > >>
> > >> Filed FLINK-6209 to track it.
> > >>
> > >> ~Haohui
> > >>
> > >> On Mon, Mar 27, 2017 at 2:59 AM Chesnay Schepler 
> > >> wrote:
> > >>
> > >>> If possible I would like to include FLINK-6183 & FLINK-6184 as well.
> > >>>
> > >>> They fix 2 metric-related issues that could arise when a Task is
> > >>> cancelled very early. (like, right away)
> > >>>
> > >>> FLINK-6183 fixes a memory leak where the TaskMetricGroup was never 
> > >>> closed
> > >>> FLINK-6184 fixes a NullPointerExceptions in the buffer metrics
> > >>>
> > >>> PR here: https://github.com/apache/flink/pull/3611
> > >>>
> > >>> On 26.03.2017 12:35, Aljoscha Krettek wrote:
> >  I opened a PR for FLINK-6188: https://github.com/apache/
> > >> flink/pull/3616
> > >>> 
> >  This improves the previously very sparse test coverage for
> > >>> timestamp/watermark assigners and fixes the bug.
> > > On 25 Mar 2017, at 10:22, Ufuk Celebi  wrote:
> > >
> > > I agree with Aljoscha.
> > >
> > > -1 because of FLINK-6188
> > >
> > >
> > > On Sat, Mar 25, 2017 at 9:38 AM, Aljoscha Krettek <
> > >> aljos...@apache.org>
> > >>> wrote:
> > >> I filed this issue, which was observed by a user:
> > >>> https://issues.apache.org/jira/browse/FLINK-6188
> > >> I think that’s blocking for 1.2.1.
> > >>
> > >>> On 24 Mar 2017, at 18:57, Ufuk Celebi  wrote:
> > >>>
> > >>> RC1 doesn't contain Stefan's backport for the Asynchronous snapshots
> > >>> for heap-based keyed state that has been merged. Should we create
> > >> RC2
> > >>> with that fix since the voting period only starts on Monday? I think
> > >>> it would only mean rerunning the scripts on your side, right?
> > >>>
> > >>> – Ufuk
> > >>>
> > >>>
> > >>> On Fri, Mar 24, 2017 at 3:05 PM, Robert Metzger <
> > >> rmetz...@apache.org>
> > >>> wrote:
> >  Dear Flink community,
> > 
> >  Please vote on releasing the following candidate as Apache Flink
> > >>> version 1.2
> >  .1.
> > 
> >  The commit to be voted on:
> >  *732e55bd* (*
> > >>> http://git-wip-us.apache.org/repos/asf/flink/commit/732e55bd
> >  *)
> > 
> >  Branch:
> >  release-1.2.1-rc1
> > 
> >  The release artifacts to be voted on can be found at:
> >  *http://people.apache.org/~rmetzger/flink-1.2.1-rc1/
> >  *
> > 
> >  The release artifacts are signed with the key with fingerprint
> > >>> D9839159:
> >  http://www.apache.org/dist/flink/KEYS
> > 
> >  The staging repository for this release can be found at:
> > 
> > >>> https://repository.apache.org/content/repositories/orgapacheflink-1116
> >  -
> > 
> > 
> >  The vote ends on Wednesday, March 29, 2017, 3pm CET.
> > 
> > 
> >  [ ] +1 Release this package as Apache Flink 1.2.1
> >  [ ] -1 Do not release this package, because ...
> > >>>
> > 


[jira] [Created] (FLINK-6221) Add Promethus support to metrics

2017-03-30 Thread Joshua Griffith (JIRA)
Joshua Griffith created FLINK-6221:
--

 Summary: Add Promethus support to metrics
 Key: FLINK-6221
 URL: https://issues.apache.org/jira/browse/FLINK-6221
 Project: Flink
  Issue Type: Improvement
  Components: Metrics
Affects Versions: 1.2.0
Reporter: Joshua Griffith
Priority: Minor


[Prometheus|https://prometheus.io/] is becoming popular for metrics and 
alerting. It's possible to use 
[statsd-exporter|https://github.com/prometheus/statsd_exporter] to load Flink 
metrics into Prometheus but it would be far easier if Flink supported Promethus 
as a metrics reporter. A [dropwizard 
client|https://github.com/prometheus/client_java/tree/master/simpleclient_dropwizard]
 exists that could be integrated into the existing metrics system.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [Bug]Question about StreamExecutionEnvironment.createRemoteEnvironment

2017-03-30 Thread Till Rohrmann
Hi,

this is definitely a bug you're describing. The problem is that with
FLINK-2821 [1], we can no longer resolve the hostname on the JobManager.
Thus, you should specify the external hostname of the machine on which you
have started the JobManager (given that you've used the very same hostname
for the jobmanager.rpc.address configuration value.

Now the actual problem is that ExecutionEnvironment.createRemoteEnvironment
resolves the given hostname. This will lookup the hostname for the given
ip. In contrast, the StreamExecutionEnvironment.createRemoteEnvironment
won't do this. I think the latter should do the same to fix the problem.
Can you open a JIRA issue for the problem?

[1] https://issues.apache.org/jira/browse/FLINK-2821

Cheers,
Till

On Thu, Mar 30, 2017 at 2:41 PM, canbinzheng <2056268...@qq.com> wrote:

> I run two examples in the same client.
>
> first one use   /
> ExecutionEnvironment.createRemoteEnvironment("10.75.203.170", 59551)/
> second one use
> /StreamExecutionEnvironment.createRemoteEnvironment("10.75.203.170",
> 59551)/
>
> the first example run successfully, but the second example failed(connect
> to
> akka timeout), for the second one, if I change host parameter from ip to
> hostname, it works.
>
> I check the source code, and I found the ip would be translated into
> hostname automatically in the first example, but the second one don't.
>
> I am so confused if the ActorRef must use hostname as part of address, not
> ip? is it a bug?
>
>
>
>
> --
> View this message in context: http://apache-flink-mailing-
> list-archive.1008284.n3.nabble.com/Bug-Question-about-
> StreamExecutionEnvironment-createRemoteEnvironment-tp16854.html
> Sent from the Apache Flink Mailing List archive. mailing list archive at
> Nabble.com.
>


[Bug]Question about StreamExecutionEnvironment.createRemoteEnvironment

2017-03-30 Thread canbinzheng
I run two examples in the same client.

first one use   /
ExecutionEnvironment.createRemoteEnvironment("10.75.203.170", 59551)/
second one use   
/StreamExecutionEnvironment.createRemoteEnvironment("10.75.203.170", 59551)/

the first example run successfully, but the second example failed(connect to
akka timeout), for the second one, if I change host parameter from ip to
hostname, it works.

I check the source code, and I found the ip would be translated into
hostname automatically in the first example, but the second one don't.

I am so confused if the ActorRef must use hostname as part of address, not
ip? is it a bug?




--
View this message in context: 
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Bug-Question-about-StreamExecutionEnvironment-createRemoteEnvironment-tp16854.html
Sent from the Apache Flink Mailing List archive. mailing list archive at 
Nabble.com.


[jira] [Created] (FLINK-6219) Add a state backend which supports sorting

2017-03-30 Thread sunjincheng (JIRA)
sunjincheng created FLINK-6219:
--

 Summary: Add a state backend which supports sorting
 Key: FLINK-6219
 URL: https://issues.apache.org/jira/browse/FLINK-6219
 Project: Flink
  Issue Type: New Feature
  Components: State Backends, Checkpointing, Table API & SQL
Reporter: sunjincheng


When we implement the OVER window of 
[FLIP11|https://cwiki.apache.org/confluence/display/FLINK/FLIP-11%3A+Table+API+Stream+Aggregations]
We notice that we need a state backend which supports sorting, allows for 
efficient insertion, traversal in order, and removal from the head. 

For example: In event-time OVER window, we need to sort by time,If the datas as 
follow:
{code}
(1L, 1, Hello)
(2L, 2, Hello)
(5L, 5, Hello)
(4L, 4, Hello)
{code}
We randomly insert the datas, just like:
{code}
put((2L, 2, Hello)),put((1L, 1, Hello)),put((5L, 5, Hello)),put((4L, 4, Hello)),
{code}
We deal with elements in time order:
{code}
process((1L, 1, Hello)),process((2L, 2, Hello)),process((4L, 4, 
Hello)),process((5L, 5, Hello))
{code}
Welcome anyone to give feedback,And what do you think? [~xiaogang.shi] 
[~aljoscha] [~fhueske] 






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)