Re: Flink not giving full reason as to why job submission failed

2019-05-23 Thread Chesnay Schepler
Please open a new JIRA. FLINK-11902 modified the REST API to no longer 
hide the exception, but the WebUI isn't handling the error response 
properly (it only reads and displays part of it).
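
Until the WebUI handles it correctly, one workaround is to submit through the
REST API directly, which does return the full error. A rough sketch (host,
jar id and entry class are placeholders; the response shape is approximate):

# upload the jar via the WebUI or /jars/upload, then run it via REST;
# the error response body contains the full exception chain
curl -X POST "http://<jobmanager>:8081/jars/<jar-id>/run?entry-class=<main-class>"
# -> {"errors": ["org.apache.flink.client.program.ProgramInvocationException:
#     The main method caused an error. ... Caused by: ..."]}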


On 20/05/2019 16:24, Wouter Zorgdrager wrote:

Hi Harshith,

This is indeed an issue that is not resolved in 1.8. I added a comment to
the (closed) Jira issue, so it might be fixed in a future release.


Cheers,
Wouter

On Mon, 20 May 2019 at 16:18, Kumar Bolar, Harshith <hk...@arity.com> wrote:


Hi Wouter,

I’ve upgraded Flink to 1.8, but now I only see "Internal server
error" on the dashboard when a job deployment fails.

But in the logs I see the correct exception:

Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
    at functions.PersistPIDMessagesToCassandra.main(PersistPIDMessagesToCassandra.java:59)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
    ... 9 more

Is there any way to show these errors on the dashboard? Many application
teams deploy jobs through the dashboard and don't have ready access to
the logs.

Thanks,

Harshith

*From: *Wouter Zorgdrager <w.d.zorgdra...@tudelft.nl>
*Date: *Thursday, 16 May 2019 at 7:56 PM
*To: *Harshith Kumar Bolar <hk...@arity.com>
*Cc: *user <user@flink.apache.org>
*Subject: *[External] Re: Flink not giving full reason as to why
job submission failed

Hi Harshith,

This was indeed an issue in 1.7.2, but fixed in 1.8.0. See the
corresponding Jira issue [1].

Cheers,

Wouter

[1]: https://issues.apache.org/jira/browse/FLINK-11902




On Thu, 16 May 2019 at 16:05, Kumar Bolar, Harshith <hk...@arity.com> wrote:

Hi all,

After upgrading Flink to 1.7.2, when I try to submit a job
from the dashboard and there's some issue with the job, the
job submission fails with the following error.

Exception occurred in REST handler:
org.apache.flink.client.program.ProgramInvocationException:
The main method caused an error.

There's no other reason given as to why the job failed to
submit. This was not the case in 1.4.2. Is there a way to see
the full reason why the job failed to deploy? I see the same
error in the logs too with no additional information.

Thanks,

Harshith





Re: Connectors (specifically Kinesis Connector)

2019-05-23 Thread Tzu-Li (Gordon) Tai
Hi Steven,

I assume you are referring to the problem that we don't publish the Kinesis
connector artifacts to Maven due to the licensing issue with the KCL?
I didn't manage to find a JIRA that addresses this issue directly,
but the most closely related one would be this:
https://issues.apache.org/jira/browse/FLINK-3924.
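
For completeness, the usual workaround is to build the connector from source
and install/deploy it yourself. A rough sketch based on the 1.8 docs (the
exact profile and steps may differ between versions):

# from the root of a Flink source checkout matching your release:
mvn clean install -Pinclude-kinesis -DskipTests
# the resulting flink-connector-kinesis artifact lands in the local Maven
# repository and can be deployed to an internal repository from there.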

Cheers,
Gordon

On Tue, May 21, 2019 at 10:24 PM Steven Nelson 
wrote:

> Hello!
>
> We keep having difficulties with the Kinesis connector. We have to publish
> our own version, and we understand why. What I am curious about is the plan
> to make this better in the future. Is there an issue/FLIP that I can
> reference when talking internally about this?
>
> -Steve
>


Count Window Trigger that only fires once

2019-05-23 Thread Frank Wilson
Hi,

Is there a way to make the count window trigger fire only once? I would
like a session window to only emit the first element it receives
immediately rather than waiting until the watermark passes the end of the
window.

Thanks,

Frank


How many task managers to launch for a job?

2019-05-23 Thread black chase
Hi,

I am redesigning the scheduler of the JobManager to place the tasks of a job
across TaskManagers according to a scheduling policy.

I am reading the FLIP-6 proposal and found that the common case is "one
TaskManager launches one slot" and "one Flink cluster serves one job". But
I did not find how many TaskManagers should be launched on a computing node.
Is there any common practice for this?

-- 
Best Regards!
Pengcheng Duan


Re: [SURVEY] Usage of flink-ml and [DISCUSS] Delete flink-ml

2019-05-23 Thread Rong Rong
+1 for the deletion.

I also think it might be a good idea to update the roadmap with the
removal/development plan, since we've reached consensus on FLIP-39.

Thanks,
Rong


On Wed, May 22, 2019 at 8:26 AM Shaoxuan Wang  wrote:

> Hi Chesnay,
> Yes, you are right. There are no active commits planned for the legacy
> flink-ml package. It does not matter whether we delete it now or later. I
> will open a PR and remove it.
>
> Shaoxuan
>
> On Wed, May 22, 2019 at 7:05 PM Chesnay Schepler 
> wrote:
>
>> I believe we can remove it regardless since users could just use the 1.8
>> version against future releases.
>>
>> Generally speaking, any library/connector that is no longer actively
>> developed can be removed from the project as existing users can always
>> rely on previous versions, which should continue to work by virtue of
>> working against @Stable APIs.
>>
>> On 22/05/2019 12:08, Shaoxuan Wang wrote:
>> > Hi Flink community,
>> >
>> > We plan to delete/deprecate the legacy flink-libraries/flink-ml package
>> > in Flink 1.9 and replace it with the new flink-ml interface proposed in
>> > FLIP-39 (FLINK-12470).
>> > Before we remove this package, I want to reach out to you and ask
>> > whether any active project still uses it. Please respond to this
>> > thread and outline how you use flink-libraries/flink-ml.
>> > Depending on the replies about the activity and adoption of
>> > flink-libraries/flink-ml, we will decide to either delete this package
>> > in Flink 1.9 or deprecate it for now and remove it in the next release
>> > after 1.9.
>> >
>> > Thanks for your attention and help!
>> >
>> > Regards,
>> > Shaoxuan
>> >
>>
>>


How many task managers can a cluster reasonably handle?

2019-05-23 Thread Antonio Verardi
Hello Flink users,

How many task managers can one expect a Flink cluster to reasonably
handle?

I want to move a pretty big cluster from a setup on AWS EMR to one based on
Kubernetes. I was wondering whether it makes sense to break up the beefy
task managers the cluster had into something like 150 task manager containers
with one slot each. This is a pattern that a couple of different people I met
at
meetups told me they are using in production, but I don't know if they
tried something similar at this scale. Would the jobmanager be able to
manage so many task managers in your opinion?

Cheers,
Antonio


Upgrading from 1.4 to 1.8, losing Kafka consumer state

2019-05-23 Thread Nikolas Davis
Howdy,

We're in the process of upgrading to 1.8. When restoring state to the new
cluster (using a savepoint) we are seeing our Kafka consumers restart from
the earliest offset. We're not receiving any other indication that our
state was not accepted as part of the deploy, e.g. we are not allowing
unrestored state and are not receiving any errors.

We have our consumers set up with the same consumer group and using the same
consumer (FlinkKafkaConsumer010) as our 1.4 deploy.

Has anyone encountered this? Any idea what we might be doing wrong?

What's also strange is that we are not setting auto.offset.reset, which
defaults to largest (analogous to latest, correct?) -- yet that is not what
we're seeing happen.
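
For reference, this is roughly how the consumer is wired (a simplified,
self-contained sketch rather than our exact code; topic, group and uid
values are illustrative):

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;

public class KafkaSourceSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "kafka:9092");  // illustrative
        props.setProperty("group.id", "our-consumer-group");   // same group as the 1.4 job
        // auto.offset.reset is intentionally left unset

        FlinkKafkaConsumer010<String> consumer =
                new FlinkKafkaConsumer010<>("our-topic", new SimpleStringSchema(), props);

        env.addSource(consumer)
           .uid("kafka-source")  // stable uid so savepoint state can map back to this operator
           .name("kafka-source")
           .print();

        env.execute("kafka source sketch");
    }
}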

Regards,

Nik


Re: Count Window Trigger that only fires once

2019-05-23 Thread Congxian Qiu
Hi Frank,

It seems you want a custom trigger; maybe the doc [1] can help.

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/stream/operators/windows.html#triggers
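
A minimal, untested sketch of such a trigger (class and state names are made
up): it fires on the first element of each window and stays silent
afterwards, including when the watermark passes the end of the window. The
"fired" flag is kept in mergeable state so it also works for session windows.

import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.common.state.ReducingState;
import org.apache.flink.api.common.state.ReducingStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.windowing.triggers.Trigger;
import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;

// Fires once, on the first element assigned to a window, and never again.
public class FireOnFirstElementTrigger extends Trigger<Object, TimeWindow> {

    private final ReducingStateDescriptor<Boolean> firedDesc =
            new ReducingStateDescriptor<>(
                    "fired", (ReduceFunction<Boolean>) (a, b) -> a || b, Types.BOOLEAN);

    @Override
    public TriggerResult onElement(Object element, long timestamp, TimeWindow window,
                                   TriggerContext ctx) throws Exception {
        ReducingState<Boolean> fired = ctx.getPartitionedState(firedDesc);
        if (fired.get() == null) {       // first element of this window
            fired.add(true);
            return TriggerResult.FIRE;   // emit the window contents (just this element)
        }
        return TriggerResult.CONTINUE;   // ignore every later element
    }

    @Override
    public TriggerResult onEventTime(long time, TimeWindow window, TriggerContext ctx) {
        return TriggerResult.CONTINUE;   // do not fire again when the window ends
    }

    @Override
    public TriggerResult onProcessingTime(long time, TimeWindow window, TriggerContext ctx) {
        return TriggerResult.CONTINUE;
    }

    @Override
    public boolean canMerge() {
        return true;
    }

    @Override
    public void onMerge(TimeWindow window, OnMergeContext ctx) {
        ctx.mergePartitionedState(firedDesc);  // a merged session keeps the "fired" flag
    }

    @Override
    public void clear(TimeWindow window, TriggerContext ctx) throws Exception {
        ctx.getPartitionedState(firedDesc).clear();
    }
}

// usage, e.g.:
// stream.keyBy(...)
//       .window(EventTimeSessionWindows.withGap(Time.minutes(10)))
//       .trigger(new FireOnFirstElementTrigger())
//       .process(new MyProcessWindowFunction());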

Best Congxian
On May 23, 2019, 23:38 +0800, Frank Wilson , wrote:
> Hi,
>
> Is there a way to make the count window trigger fire only once? I would like 
> a session window to only emit the first element it receives immediately 
> rather than waiting until the watermark passes the end of the window.
>
> Thanks,
>
> Frank


Re: How many task managers to launch for a job?

2019-05-23 Thread Xintong Song
Hi black,

If you are running Flink on YARN or Mesos, Flink will automatically
allocate resources and launch new TaskManagers as needed.

If you are using Flink standalone mode, then the easiest way is to enable
slot sharing and put all the vertices into the same group (which is the
default). That way, the total number of slots (or TaskManagers, if you
configure one slot per TaskManager) needed to run the job equals the maximum
parallelism across the job graph vertices. Further information on slot
sharing can be found in the Flink documentation.
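
For illustration, a rough sketch (operator parallelisms are made up): with
default slot sharing, the job below needs max(1, 8, 4, 2) = 8 slots, i.e. 8
TaskManagers if each TaskManager is configured with one slot.

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SlotSharingExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements(1, 2, 3, 4, 5)                    // non-parallel source, parallelism 1
           .map(new MapFunction<Integer, Integer>() {
               @Override
               public Integer map(Integer value) {
                   return value * 2;
               }
           }).setParallelism(8)                            // 8 map subtasks
           .filter(value -> value > 2).setParallelism(4)   // 4 filter subtasks
           .print().setParallelism(2);                     // 2 sink subtasks

        // With default slot sharing (all vertices in the same group), the job needs
        // max(1, 8, 4, 2) = 8 slots. With taskmanager.numberOfTaskSlots: 1 in
        // flink-conf.yaml, that means 8 TaskManagers.
        env.execute("slot sharing example");
    }
}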

Thank you~

Xintong Song



On Thu, May 23, 2019 at 11:49 PM black chase 
wrote:

>
> Hi,
>
> I am redesigning the scheduler of the JobManager to place the tasks of a job
> across TaskManagers according to a scheduling policy.
>
> I am reading the FLIP-6 proposal and found that the common case is "one
> TaskManager launches one slot" and "one Flink cluster serves one job". But
> I did not find how many TaskManagers should be launched on a computing node.
> Is there any common practice for this?
>
> --
> Best Regards!
> Pengcheng Duan
>


Re: How many task managers can a cluster reasonably handle?

2019-05-23 Thread Xintong Song
Hi Antonio,

In our production experience, Flink can certainly handle 150 TaskManagers per
cluster. We have actually run much larger jobs, where a single job demands
thousands of TaskManagers. However, as the job scale increases, it gets
harder to achieve good stability, because more tasks means a higher chance of
a job failover (or region failover, if enabled) being caused by a single task
failure. So if your jobs are not at that scale, I think 150 TaskManagers per
cluster would be a good choice.

If you do encounter a JobManager performance bottleneck, it can usually be
solved by increasing the JobManager's resources, e.g. with the '-jm'
argument.
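
A rough sketch of what that can look like (the '-jm' flag applies to YARN
sessions; on standalone or Kubernetes the equivalent is the
jobmanager.heap.size option in flink-conf.yaml -- values are illustrative):

# YARN session: give the JobManager 4 GB of memory
./bin/yarn-session.sh -jm 4096m -tm 4096m

# standalone / Kubernetes: set the JobManager heap in flink-conf.yaml
jobmanager.heap.size: 4096m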

Thank you~

Xintong Song



On Fri, May 24, 2019 at 2:33 AM Antonio Verardi  wrote:

> Hello Flink users,
>
> How many task managers can one expect a Flink cluster to reasonably
> handle?
>
> I want to move a pretty big cluster from a setup on AWS EMR to one based
> on Kubernetes. I was wondering whether it makes sense to break up the beefy
> task managers the cluster had into something like 150 task manager containers
> with one slot each. This is a pattern that a couple of different people I met
> at
> meetups told me they are using in production, but I don't know if they
> tried something similar at this scale. Would the jobmanager be able to
> manage so many task managers in your opinion?
>
> Cheers,
> Antonio
>