Re: Migrating createTemporaryView to new Table api.

2021-10-14 Thread Niels Basjes
StreamTableEnvironment#fromDataStream [1] > > [1] > https://ci.apache.org/projects/flink/flink-docs-release-1.14/api/java/org/apache/flink/table/api/bridge/java/StreamTableEnvironment.html#fromDataStream-org.apache.flink.streaming.api.datastream.DataStream-org.apache.flink.table.api.Schema- > > Niels Basjes 于2

Migrating createTemporaryView to new Table api.

2021-10-13 Thread Niels Basjes
ect I'm doing something wrong regarding the mentioned "generic raw type" and the way I'm trying to define the Schema. What I essentially am looking for is the correct way to give the 3 provided columns a new name and type. How do I do this correctly in the new API? -- Best regards / Met vriendelijke groeten, Niels Basjes

Re: Flink to BigTable

2021-01-24 Thread Niels Basjes
Hi, I haven't tried it myself yet but there is a Flink connector for HBase and I remember someone telling me that Google has made a library available which is effectively the HBase client which talks to BigTable in the backend. Like I said: I haven't tried this yet myself. Niels Bas

Re: [ANNOUNCE] Apache Flink 1.12.0 released

2020-12-11 Thread Niels Basjes
mmunity who > made this release possible! > > Regards, > Dian & Robert > > -- Best regards / Met vriendelijke groeten, Niels Basjes

Re: [Flink Unit Tests] Unit test for Flink streaming codes

2020-08-03 Thread Niels Basjes
he data and put the logic under >>> test in the middle. That may be a part of your pipeline or even the whole >>> pipeline. >>> >>> If you want to have some scala inspiration, have a look at: >>> >>> https://github.com/apache/flink/blob/5f0183fe79d10ac36101f60f2

Re: [Flink Unit Tests] Unit test for Flink streaming codes

2020-08-01 Thread Niels Basjes
b.com/apache/flink/blob/master/flink-examples/flink-examples-streaming/src/main/scala/org/apache/flink/streaming/scala/examples/socket/SocketWindowWordCount.scala > > Regards, > Vijay > > On Fri, Jul 31, 2020 at 12:22 PM Niels Basjes wrote: > >> Does this test in one of m

Re: [Flink Unit Tests] Unit test for Flink streaming codes

2020-07-31 Thread Niels Basjes
Does this test in one of my own projects do what you are looking for? https://github.com/nielsbasjes/yauaa/blob/1e1ceb85c507134614186e3e60952112a2daabff/udfs/flink/src/test/java/nl/basjes/parse/useragent/flink/TestUserAgentAnalysisMapperClass.java#L107 On Fri, 31 Jul 2020, 20:20 Vijayendra Yadav

Re: Handle idle kafka source in Flink 1.9

2020-07-22 Thread Niels Basjes
Have a look at this presentation I gave a few weeks ago. https://youtu.be/bQmz7JOmE_4 Niels Basjes On Wed, 22 Jul 2020, 08:51 bat man, wrote: > Hi Team, > > Can someone share their experiences handling this. > > Thanks. > > On Tue, Jul 21, 2020 at 11:30 AM bat man wrote

Re: Chaining the creation of a WatermarkStrategy doesn't work?

2020-07-08 Thread Niels Basjes
-mailing-list-archive.2336050.n4.nabble.com/ > -- Best regards / Met vriendelijke groeten, Niels Basjes

Chaining the creation of a WatermarkStrategy doesn't work?

2020-07-07 Thread Niels Basjes
Why is that? -- Best regards / Met vriendelijke groeten, Niels Basjes

Re: [PROPOSAL] Contribute training materials to Apache Flink

2020-04-09 Thread Niels Basjes
, and to add the > exercises to flink-playgrounds -- but these points can be discussed > separately once we've established that the community wants this content. > > Looking forward to hearing what you think! > > Best regards, > David > -- Best regards / Met vriendelijke groeten, Niels Basjes

Deploying native Kubernetes via yaml files?

2020-03-24 Thread Niels Basjes
s / Met vriendelijke groeten, Niels Basjes

Re: [Kubernetes] java.lang.OutOfMemoryError: Metaspace ?

2020-03-02 Thread Niels Basjes
limit via >> "-XX:MaxMetaspaceSize" >> by default. The default value is 96m, loading too many classes will cause >> "OutOfMemoryError: Metaspace"[1]. So you need to increase the configured >> value. >> >> >> [1]. >> https://ci.apache

[Kubernetes] java.lang.OutOfMemoryError: Metaspace ?

2020-03-02 Thread Niels Basjes
Hi, I'm running a lot of batch jobs on Kubernetes once in a while I get this exception. What is causing this? How can I fix this? Niels Basjes java.lang.OutOfMemoryError: Metaspace at java.lang.ClassLoader.defineClass1(Native Method) at java.lang.ClassLoader.defineClass(ClassLoader

Re: Giving useful names to the SQL steps/operators.

2020-03-01 Thread Niels Basjes
Thanks. On Sat, Feb 29, 2020 at 4:20 PM Yuval Itzchakov wrote: > > Unfortunately, it isn't possible. You can't set names to steps like > ordinary Java/Scala functions. > > On Sat, 29 Feb 2020, 17:11 Niels Basjes, wrote: > >> Hi, >> >> I

Writing a DataSet to ElasticSearch

2020-03-01 Thread Niels Basjes
ernative I came up with is to write the output of my batch to a file and then load that (with a stream) into ES. What is the proper solution? Is there an OutputFormat for ES I can use that I overlooked? -- Best regards / Met vriendelijke groeten, Niels Basjes

Giving useful names to the SQL steps/operators.

2020-02-29 Thread Niels Basjes
NetworkType, clicks, visitors) exceeded the 80 characters length limit and was truncated. As you can see this impacts not only the names of the steps but also the metrics. My question if it is possible to specify a name for the step, similar to what I can do in the Java code? -- Best regards / Met vriendelijke groeten, Niels Basjes

Re: [Native Kubernetes] [Ceph S3] Unable to load credentials from service endpoint.

2020-02-28 Thread Niels Basjes
now ALL jobs in this Flink cluster have the same credentials. Is there a way to set the S3 credentials on a per job or even per connection basis? Niels Basjes On Fri, Feb 28, 2020 at 4:38 AM Yang Wang wrote: > Hi Niels, > > Glad to hear that you are trying Flink native K8s integration

[Native Kubernetes] [Ceph S3] Unable to load credentials from service endpoint.

2020-02-27 Thread Niels Basjes
setup. [default] access_key = myAccessKey secret_key = mySecretKey host_base = s3.example.nl *I'm stuck, please help:* - What is causing the differences in behaviour between local and in k8s? It works locally but not in the cluster. - How do I figure out what network it is trying to reach in k8s? Thanks. -- Best regards / Met vriendelijke groeten, Niels Basjes

Re: [Flink 1.10] How do I use LocalCollectionOutputFormat now that writeUsingOutputFormat is deprecated?

2020-02-22 Thread Niels Basjes
o the "no operators defined" error. > However, if you have collect(), print(), execute(), then the print() is > filling the stream graph again, and you are executing two Flink jobs: the > collect job and the execute job. > > I hope I got it right this time :) > > Best, >

Re: [Flink 1.10] How do I use LocalCollectionOutputFormat now that writeUsingOutputFormat is deprecated?

2020-02-21 Thread Niels Basjes
gentAnalysisMapperInline class is doing some magic > that breaks with the StreamGraphGenerator? > > Best, > Robert > > On Tue, Feb 18, 2020 at 9:59 AM Niels Basjes wrote: > >> Hi Gordon, >> >> Thanks. This works for me. >> >> I find it strange tha

Re: [Flink 1.10] How do I use LocalCollectionOutputFormat now that writeUsingOutputFormat is deprecated?

2020-02-18 Thread Niels Basjes
finitionDataStream(TestUserAgentAnalysisMapperInline.java:144) Did I do something wrong? Is this a bug in the DataStreamUtils ? Niels Basjes On Mon, Feb 17, 2020 at 8:56 AM Tzu-Li Tai wrote: > Hi, > > To collect the elements of a DataStream (usually only meant for testing > pur

[Flink 1.10] How do I use LocalCollectionOutputFormat now that writeUsingOutputFormat is deprecated?

2020-02-14 Thread Niels Basjes
t regards / Met vriendelijke groeten, Niels Basjes

Re: Flink SQL: How to tag a column as 'rowtime' from a Avro based DataStream?

2019-09-10 Thread Niels Basjes
e property (and thereby the > TimeIndicatorTypeInfo) and you wouldn't know to fiddle with the output > types. > > Best, Fabian > > Am Mi., 21. Aug. 2019 um 10:51 Uhr schrieb Niels Basjes : > >> Hi, >> >> It has taken me quite a bit of time to figure this o

Re: Flink SQL: How to tag a column as 'rowtime' from a Avro based DataStream?

2019-08-21 Thread Niels Basjes
upleType = new RowTypeInfo(fieldTypes); DataStream resultSet = tableEnv.toAppendStream(resultTable, tupleType); Which gives me the desired DataStream. Niels Basjes On Wed, Aug 14, 2019 at 5:13 PM Timo Walther wrote: > Hi Niels, > > if you are coming from DataStream

Flink SQL: How to tag a column as 'rowtime' from a Avro based DataStream?

2019-08-14 Thread Niels Basjes
that the timestamp column show be treated as the rowtime. How do I do that? -- Best regards / Met vriendelijke groeten, Niels Basjes

Re: Is the provided Serializer/TypeInformation checked "too late"?

2019-07-09 Thread Niels Basjes
nt: We are aware of these confusions and the Table > & SQL API will hopefully not use the TypeExtractor anymore in 1.10. This > is what I am working on at the moment. > > Regards, > Timo > > [0] > > https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/types_s

Is the provided Serializer/TypeInformation checked "too late"?

2019-07-08 Thread Niels Basjes
th mentioned examples the correct serialization classes when running. So what is happening here? Did I forget to do a required call? So is this a bug? Is the provided serialization via TypeInformation 'skipped' during startup and only used during runtime? -- Best regards / Met vriendelijke groeten, Niels Basjes

BigQuery source ?

2019-05-31 Thread Niels Basjes
have not been able to find anything yet. Any pointers/hints/code fragments are welcome. Thanks -- Best regards / Met vriendelijke groeten, Niels Basjes

Re: [DISCUSS] Create a Flink ecosystem website

2019-03-11 Thread Niels Basjes
Hi, The Beam project has something in this area that is simply a page within their documentation website: https://beam.apache.org/documentation/sdks/java-thirdparty/ Niels Basjes On Fri, Mar 8, 2019 at 11:39 PM Bowen Li wrote: > > Confluent hub for Kafka is another good example of this k

Re: [DISCUSS] Dropping flink-storm?

2018-09-29 Thread Niels Basjes
I would drop it. Niels Basjes On Sat, 29 Sep 2018, 10:38 Kostas Kloudas, wrote: > +1 to drop it as nobody seems to be willing to maintain it and it also > stands in the way for future developments in Flink. > > Cheers, > Kostas > > > On Sep 29, 2018, at 8:19 AM, Tzu-Li

Re: Order of events in a Keyed Stream

2018-07-29 Thread Niels Basjes
. Niels Basjes On Sun, Jul 29, 2018 at 9:25 AM, Congxian Qiu wrote: > Hi, > Maybe the messages of the same key should be in the *same partition* of > Kafka topic > > 2018-07-29 11:01 GMT+08:00 Hequn Cheng : > >> Hi harshvardhan, >> If 1.the messages exist on the

Re: HBase config settings go missing within Yarn.

2017-10-26 Thread Niels Basjes
ltToOutType(Result result) { > return new > String(result.getFamilyMap("v".getBytes(UTF_8)).get("column".getBytes(UTF_8))); > } > > @Override > protected Scan getScanner() { > return new Scan(); > } > } > } > >

Re: HBase config settings go missing within Yarn.

2017-10-24 Thread Niels Basjes
roach. Niels Basjes On Tue, Oct 24, 2017 at 11:29 AM, Niels Basjes wrote: > Minor correction: The HBase jar files are on the classpath, just in a > different order. > > On Tue, Oct 24, 2017 at 11:18 AM, Niels Basjes wrote: > >> I did some more digging. >> >>

Re: HBase config settings go missing within Yarn.

2017-10-24 Thread Niels Basjes
Minor correction: The HBase jar files are on the classpath, just in a different order. On Tue, Oct 24, 2017 at 11:18 AM, Niels Basjes wrote: > I did some more digging. > > I added extra code to print both the environment variables and the > classpath that is used by the HBaseConf

Re: HBase config settings go missing within Yarn.

2017-10-24 Thread Niels Basjes
y needs the HBase client (Jar, packaged into application) and the HBase zookeeper settings (present on the machine where it is started). Niels Basjes On Mon, Oct 23, 2017 at 10:23 AM, Piotr Nowojski wrote: > Till do you have some idea what is going on? I do not see any meaningful >

Re: HBase config settings go missing within Yarn.

2017-10-22 Thread Niels Basjes
://github.com/nielsbasjes/FlinkHBaseConnectProblem Niels Basjes On Fri, Oct 20, 2017 at 6:54 PM, Piotr Nowojski wrote: > Is this /etc/hbase/conf/hbase-site.xml file is present on all of the > machines? If yes, could you share your code? > > On 20 Oct 2017, at 16:29, Niels Basjes

Re: HBase config settings go missing within Yarn.

2017-10-20 Thread Niels Basjes
ot;file:/etc/hbase/conf/hbase-site.xml")); > > ? > > To me it seems like it is a problem with misconfigured HBase and not > something related to Flink. > > Piotrek > > On 20 Oct 2017, at 13:44, Niels Basjes wrote: > > To facilitate you guys helping me I put thi

Re: HBase config settings go missing within Yarn.

2017-10-20 Thread Niels Basjes
To facilitate you guys helping me I put this test project on github: https://github.com/nielsbasjes/FlinkHBaseConnectProblem Niels Basjes On Fri, Oct 20, 2017 at 1:32 PM, Niels Basjes wrote: > Hi, > > Ik have a Flink 1.3.2 application that I want to run on a Hadoop yarn > cluster

HBase config settings go missing within Yarn.

2017-10-20 Thread Niels Basjes
s. As a workaround I currently put this extra line in my code which I know is nasty but "works on my cluster" hbaseConfig.addResource(new Path("file:/etc/hbase/conf/hbase-site.xml")); What am I doing wrong? What is the right way to fix this? -- Best regards / Met vriendelijke groeten, Niels Basjes

Re: PartitionNotFoundException when running in yarn-session.

2017-10-13 Thread Niels Basjes
lease report back when you have more info :-) > > – Ufuk > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP- > 1+%3A+Fine+Grained+Recovery+from+Task+Failures > > [2] https://issues.apache.org/jira/browse/FLINK-4256 > > On Thu, Oct 12, 2017 at 10:17 AM, Niels Basjes wro

Re: PartitionNotFoundException when running in yarn-session.

2017-10-12 Thread Niels Basjes
gt; intermediate result multiple times with timed backoff [1] and only > > fail the request (your stack trace) if the partition is still not > > ready although we expect it to be ready (that is there was no failure > > at the producing task). > > > > [1] Startin

PartitionNotFoundException when running in yarn-session.

2017-10-09 Thread Niels Basjes
.2? Thanks. -- Best regards / Met vriendelijke groeten, Niels Basjes

Re:Re: Just do a survey, how many people give up the storm and turn to Flink ?

2017-08-20 Thread Niels Basjes
. Yes, storm does not support stateful > processing components. So, I have to use something like Redis to store it's > stateful. > > > > > > At 2017-08-19 16:57:13, "Niels Basjes" wrote: > > Hi, > > The company I work for switched about 2 years a

Re: Just do a survey, how many people give up the storm and turn to Flink ?

2017-08-19 Thread Niels Basjes
Hi, The company I work for switched about 2 years ago because of these reasons AT THAT moment! 1) Storm doesn't run on Yarn 2) Storm doesn't support statefull processing components. 3) Storm has a bad Java api. 4) Storm is not fast enough. Some of these things have changed over the last 2 years.

Re: [POLL] Who still uses Java 7 with Flink ?

2017-07-13 Thread Niels Basjes
+1 For dropping java 1.7 On 13 Jul 2017 04:11, "Jark Wu" wrote: > +1 for dropping Java 7 > > 2017-07-13 9:34 GMT+08:00 ☼ R Nair (रविशंकर नायर) < > ravishankar.n...@gmail.com>: > >> +1 for dropping Java 1.7. >> >> On Wed, Jul 12, 2017 at 9:10 PM, Kurt Young wrote: >> >>> +1 for droppint Java 7,

Re: Writing groups of Windows to files

2017-07-04 Thread Niels Basjes
ords of a session are emitted by a single WIndowFunction > call, these records won't be interrupted by a barrier. Hence, you'll have a > "consistent" state for all windows when a checkpoint is triggered. > > I'm afraid, I'm not aware of a simpler solution

Re: Writing groups of Windows to files

2017-07-04 Thread Niels Basjes
Hi Fabian, On Fri, Jun 30, 2017 at 6:27 PM, Fabian Hueske wrote: > If I understand your use case correctly, you'd like to hold back all > events of a session until it ends/timesout and then write all events out. > So, instead of aggregating per session (the common use case), you'd just > like to

Writing groups of Windows to files

2017-06-30 Thread Niels Basjes
rds / Met vriendelijke groeten, Niels Basjes

Re: Periodic flush sink?

2017-04-29 Thread Niels Basjes
any small hfiles, leading to more work for the compaction. > > FYI > > On Sat, Apr 29, 2017 at 7:32 AM, Niels Basjes wrote: > >> Hi, >> >> I have a sink that writes my records into HBase. >> >> The data stream is attached to measurements from an in

Periodic flush sink?

2017-04-29 Thread Niels Basjes
he buffers atleast every few seconds. Simply implement a standard Java TimerTask and fire that using a Timer? Or is there a better way of doing that in Flink? -- Best regards / Met vriendelijke groeten, Niels Basjes

Re: Flink job on secure Yarn fails after many hours

2017-04-14 Thread Niels Basjes
nMaster and >> resubmits the job? >> >> Thanks, >> Max >> >> >> On Thu, Mar 17, 2016 at 12:43 PM, Niels Basjes wrote: >> > Hi, >> > >> > In my environment doing the "proxy" thing didn't work. >> > With an to

Re: Streaming file source?

2017-01-20 Thread Niels Basjes
inistic. > > Stephan > > > On Fri, Jan 20, 2017 at 11:20 AM, Niels Basjes wrote: > >> Hi, >> >> For testing and optimizing a streaming application I want to have a "100% >> accurate repeatable" substitute for a Kafka source. >> I was thinking o

Streaming file source?

2017-01-20 Thread Niels Basjes
data which (in the live situation) come from a single Kafka partition. I hate reinventing the wheel so I'm wondering is something like this already been built by someone? If so, where can I find it? -- Best regards / Met vriendelijke groeten, Niels Basjes

Re: Kafka KeyedStream source

2017-01-18 Thread Niels Basjes
t after, and then a keyBy followed with your >> heavy-processing, key-wise computations. >> Does that makes sense for what you have in mind? >> >> Cheers, >> Gordon >> >> On January 11, 2017 at 4:11:26 PM, Niels Basjes (ni...@basjes.nl) wrote: >> >&g

Re: Kafka KeyedStream source

2017-01-11 Thread Niels Basjes
to adapt). > > Cheers, > Gordon > > [1] http://apache-flink-mailing-list-archive.1008284. > n3.nabble.com/kafka-partition-assignment-td12123.html > > On January 6, 2017 at 1:38:05 AM, Niels Basjes (ni...@basjes.nl) wrote: > > Hi, > > In my scenario I have click strea

Kafka KeyedStream source

2017-01-05 Thread Niels Basjes
produces a keyed data stream? -- Best regards / Met vriendelijke groeten, Niels Basjes

Re: microsecond resolution

2016-12-06 Thread Niels Basjes
you MUST make sure you do the same with things like watermarks. And if you want to have a watermark that is 5 seconds before the current time stamp you must be sure to substract 500 instead of 5000 fom the timestamp. Niels Basjes On Mon, Dec 5, 2016 at 2:48 PM, jeff jacobson

Re: Flink Avro Kafka Reading/Writing

2016-11-12 Thread Niels Basjes
consumer side. See: https://issues.apache.org/jira/browse/AVRO-1704 https://github.com/apache/avro/blob/master/lang/java/ipc/src/test/java/org/apache/avro/message/TestCustomSchemaStore.java Niels Basjes On Fri, Nov 11, 2016 at 3:05 PM, daviD wrote: > Hi All, > > Does anyone know if Flink

Re: Unit testing a Kafka stream based application?

2016-10-27 Thread Niels Basjes
t/java/org/ > apache/flink/streaming/connectors/kafka/KafkaConsumerTestBase.java#L1408 > > I hope you can find the right code lines to copy for your purposes. > > Regards, > Robert > > > On Fri, Oct 21, 2016 at 4:00 PM, Niels Basjes wrote: > >> Hi, >> >> In addition to hav

Unit testing a Kafka stream based application?

2016-10-21 Thread Niels Basjes
. -- Best regards / Met vriendelijke groeten, Niels Basjes

Re: Delaying starting the jobmanager in yarn?

2016-08-26 Thread Niels Basjes
riptor.setLocalJarPath(new Path(flinkJarPath)); > descriptor.setTaskManagerCount(2); > descriptor.setName("Testing the YarnClusterClient"); > > final YarnClusterClient client = descriptor.deploy(); > client.run(packagedProgram, 2); > clie

Re: Running multiple jobs on yarn (without yarn-session)

2016-08-25 Thread Niels Basjes
owse/FLINK-4495 for you Niels Basjes On Thu, Aug 25, 2016 at 3:34 PM, Maximilian Michels wrote: > Hi Niels, > > This is with 1.1.1? We could fix this in the upcoming 1.1.2 release by > only using automatic shut down for detached jobs. In all other cases > we should be able to shut

Re: Delaying starting the jobmanager in yarn?

2016-08-25 Thread Niels Basjes
t; configuration and subsequently call `deploy()` on it to receive a > ClusterClient for Yarn which you can submit programs using the > `run(PackagedProgram program, String args)` method. You can also > cancel jobs or shutdown the cluster from the ClusterClient. > > Cheers, > Max > > On

Running multiple jobs on yarn (without yarn-session)

2016-08-25 Thread Niels Basjes
on it works but then I have the troubles of starting a (detached yarn-session) AND to terminate that thing again after my jobs have run. -- Best regards / Met vriendelijke groeten, Niels Basjes

Delaying starting the jobmanager in yarn?

2016-08-25 Thread Niels Basjes
o run? Can we 'manually' start and stop the jobmanager in yarn in some way from our java code? -- Best regards / Met vriendelijke groeten, Niels Basjes

Re: Batch jobs with a very large number of input splits

2016-08-23 Thread Niels Basjes
jobs in yarn-session then you MUST specify the parallelism for all steps or otherwise it will fill the yarn-session completely and you cannot run multiple jobs in parallel. Is this conclusion correct? Niels Basjes On Fri, Aug 19, 2016 at 3:18 PM, Robert Metzger wrote: > Hi Niels, > >

Re: Checking for existance of output directory/files before running a batch job

2016-08-22 Thread Niels Basjes
hdfs:///path/to/foo >> >> If that doesn't work, do you have the same Hadoop configuration on the >> machine where you test? >> >> Cheers, >> Max >> >> On Thu, Aug 18, 2016 at 2:02 PM, Niels Basjes wrote: >> > Hi, >> > >> >

Checking for existance of output directory/files before running a batch job

2016-08-18 Thread Niels Basjes
onment yet I was unable to get the 'correct' filesystem from there. What is the proper way to check this? -- Best regards / Met vriendelijke groeten, Niels Basjes

Batch jobs with a very large number of input splits

2016-08-18 Thread Niels Basjes
vriendelijke groeten, Niels Basjes

Re: Running yarn-session a kerberos secured Yarn/HBase cluster.

2016-08-01 Thread Niels Basjes
https://github.com/apache/flink/pull/2317 On Mon, Aug 1, 2016 at 11:54 AM, Niels Basjes wrote: > Thanks for the pointers towards the work you are doing here. > I'll put up a patch for the jars and such in the next few days. > https://issues.apache.org/jira/browse/FLINK-4287 &g

Re: Running yarn-session a kerberos secured Yarn/HBase cluster.

2016-08-01 Thread Niels Basjes
Thanks for the pointers towards the work you are doing here. I'll put up a patch for the jars and such in the next few days. https://issues.apache.org/jira/browse/FLINK-4287 Niels Basjes On Mon, Aug 1, 2016 at 11:46 AM, Stephan Ewen wrote: > Thank you for the breakdown of the

Running yarn-session a kerberos secured Yarn/HBase cluster.

2016-08-01 Thread Niels Basjes
all long running jobs) I would really like to have this to be a 'long lived' thing. As far as I know this is just the tip of the security ice berg and I would like to know what the correct approach is to solve this. Thanks. -- Best regards / Met vriendelijke groeten, Niels Basjes

Re: Debugging watermarks?

2016-05-26 Thread Niels Basjes
Thanks guys, Using the above code as a reference I was quickly able to find the problems in my code. Niels Basjes On Sun, May 22, 2016 at 2:00 PM, Stephan Ewen wrote: > Hi Niels! > > It may also be interesting for you to know that with the extension of the > metrics and the

Debugging watermarks?

2016-05-21 Thread Niels Basjes
ctual data. Thanks. Niels Basjes

Re: throttled stream

2016-04-16 Thread Niels Basjes
Simple idea: create a map function that only does "sleep 1/5 second" and put that in your pipeline somewhere. Niels On 16 Apr 2016 22:38, "Chen Bekor" wrote: > is there a way to consume a kafka stream using flink with a predefined > rate limit (eg 5 events per second) > > we need this because w

Re: Flink job on secure Yarn fails after many hours

2016-03-19 Thread Niels Basjes
fails once in a while and have an automatic restart feature (i.e. shell script with a while true loop). The best guess at a root cause is this https://issues.apache.org/jira/browse/HDFS-9276 If you have a real solution or a reference to a related bug report to this problem then please share! Nie

Re: Stack overflow from self referencing Avro schema

2016-03-10 Thread Niels Basjes
/jira/browse/AVRO/ Thanks Niels Basjes On Thu, Mar 10, 2016 at 4:11 PM, David Kim wrote: > Hello! > > Just wanted to check up on this again. Has anyone else seen this before or > have any suggestions? > > Thanks! > David > > On Tue, Mar 8, 2016 at 12

Re: How to ensure exactly-once semantics in output to Kafka?

2016-02-05 Thread Niels Basjes
Skip Get message 8 -> Read from Kafka --> Already have this --> Skip Get message 9 -> Read from Kafka --> Not yet in Kafka --> Write and resume normal operations. Like I said: This is just the first rough idea I had on a possible direction how this can be solved without the latency imp

Re: How to ensure exactly-once semantics in output to Kafka?

2016-02-05 Thread Niels Basjes
isturbance. What do you think? Niels Basjes On Fri, Feb 5, 2016 at 11:57 AM, Stephan Ewen wrote: > Hi Niels! > > In general, exactly once output requires transactional cooperation from > the target system. Kafka has that on the roadmap, we should be able to > integrate that onc

How to ensure exactly-once semantics in output to Kafka?

2016-02-05 Thread Niels Basjes
) each message that is read from Kafka (my input) is written to Kafka (my output) exactly once? -- Best regards / Met vriendelijke groeten, Niels Basjes

Re: Comparison of storm and flink

2016-01-24 Thread Niels Basjes
g situation. As a final note: I've been hacking at Storm for over a year now and last summer I found Flink. Today Storm is for me no longer an option and we are taking down what we already had running. Niels Basjes On 23 Jan 2016 20:59, "Vinaya M S" wrote: > Hi Flink user grou

Re: Redeployements and state

2016-01-22 Thread Niels Basjes
he configured checkpoint persistance and recovers the most recent one. Apparently there is a mismatch between what I think is useful and what has been implemented so far. Am I missing something or should I submit this as a Jira ticket for a later version? Niels Basjes On Mon, Jan 18, 2016 at 12:

Re: Redeployements and state

2016-01-14 Thread Niels Basjes
robably looking for this feature: > > https://issues.apache.org/jira/browse/FLINK-2976 > > > > Best, > > Gábor > > > > > > > > > > 2016-01-14 11:05 GMT+01:00 Niels Basjes : > >> Hi, > >> > >> I'm working on a stre

Redeployements and state

2016-01-14 Thread Niels Basjes
ution will be in some part be specific for my application. The question is what features exist in Flink to support such a clean "continue where I left of" scenario? -- Best regards / Met vriendelijke groeten, Niels Basjes

How do failovers work on yarn?

2015-12-21 Thread Niels Basjes
manager to a different node. How will I be able to submit a job after that happened? -- Best regards / Met vriendelijke groeten, Niels Basjes

Re: Tiny topology shows '0' for all stats.

2015-12-16 Thread Niels Basjes
n, out of Streaming sources and sinks? > > On Tue, Dec 15, 2015 at 5:24 AM, Niels Basjes wrote: > >> Hi, >> >> @Ufuk: I added the env.disableOperatorChaining() and indeed now I see two >> things on the screen and there are numbers counting what has happened. >&g

Re: Tiny topology shows '0' for all stats.

2015-12-15 Thread Niels Basjes
/jira/browse/FLINK-2944 https://issues.apache.org/jira/browse/FLINK-3130 Niels On Mon, Dec 14, 2015 at 5:03 PM, Ufuk Celebi wrote: > > > On 14 Dec 2015, at 16:25, Niels Basjes wrote: > > > > Hi, > > > > I have a very small topology here. > > In fact t

Tiny topology shows '0' for all stats.

2015-12-14 Thread Niels Basjes
cords arriving into Kafka. Is this a bug in Flink or am I misinterpreting the meaning of these numbers? -- Best regards / Met vriendelijke groeten, Niels Basjes

Re: Getting two types of events from a Window (Trigger)?

2015-12-11 Thread Niels Basjes
>> in each parallel instance of the source operator. And there is no way for >> there being communication between the trigger and source, since they might >> now even run on the same machine in the end. >> >> Cheers, >> Aljoscha >> > On 11 Dec 2015, at 13:11, N

Re: Getting two types of events from a Window (Trigger)?

2015-12-11 Thread Niels Basjes
ince they might > now even run on the same machine in the end. > > Cheers, > Aljoscha > > On 11 Dec 2015, at 13:11, Niels Basjes wrote: > > > > Hi, > > > > Just to let you know: I tried passing a SourceFunction but I haven't > been able to get t

Re: Getting two types of events from a Window (Trigger)?

2015-12-11 Thread Niels Basjes
mestamp; queue.add(queueElement); } } class QueueElement { String element; long timestamp; } On Fri, Dec 11, 2015 at 11:07 AM, Niels Basjes wrote: > Thanks. > > The way I solved it now is by creating a class that persists data into > something external (righ

Re: Getting two types of events from a Window (Trigger)?

2015-12-11 Thread Niels Basjes
on with Aljoscha, one way to make this more flexible is to > enhance what you can do with custom state: > - State has timeouts (for cleanup) > - Functions allow you to schedule event-time progress notifications > > Stephan > > > > On Thu, Dec 10, 2015 at 11:55 AM, Niels B

Getting two types of events from a Window (Trigger)?

2015-12-10 Thread Niels Basjes
ecial 'Source' that I can pass as a parameter to my Trigger and then onEventTime just output a 'new event' ? What do you recommend? -- Best regards / Met vriendelijke groeten, Niels Basjes

Re: Flink job on secure Yarn fails after many hours

2015-12-04 Thread Niels Basjes
b3 > >> > >> If you already have the Flink repository, check it out using > >> > >> git fetch https://github.com/mxm/flink/ > >> f49b9635bec703541f19cb8c615f302a07ea88b3 && git checkout FETCH_HEAD > >> > >> Alternatively, here's

Re: Flink job on secure Yarn fails after many hours

2015-12-02 Thread Niels Basjes
> not very crucial if this fails once for the running job. Possibly, we > could work around this problem by retrying N times in case of an > exception. > > Would it be possible for you to deploy a custom Flink 0.10.1 version > we provide and test again? > > On Wed, Dec 2, 2015 at 4

Re: Flink job on secure Yarn fails after many hours

2015-12-02 Thread Niels Basjes
wrote: > Hi Niels, > > You mentioned you have the option to update Hadoop and redeploy the > job. Would be great if you could do that and let us know how it turns > out. > > Cheers, > Max > > On Wed, Dec 2, 2015 at 3:45 PM, Niels Basjes wrote: > > Hi, > >

Re: Flink job on secure Yarn fails after many hours

2015-12-02 Thread Niels Basjes
is clear that this exception occurs upon > requesting container status information from the Resource Manager: > > >at > org.apache.flink.yarn.YarnJobManager$$anonfun$handleYarnMessage$1.applyOrElse(YarnJobManager.scala:259) > > Are there any more exceptions in the log? Do you have

Flink job on secure Yarn fails after many hours

2015-12-02 Thread Niels Basjes
g (in either Hadoop or Flink) or am I doing something wrong? Would upgrading Yarn to 2.7.1 (i.e. HDP 2.3) fix this? Niels Basjes 21:30:27,821 WARN org.apache.hadoop.security.UserGroupInformation - PriviledgedActionException as:nbasjes (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteExce

Re: Cleanup of OperatorStates?

2015-12-01 Thread Niels Basjes
at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpoint(StreamTask.java:440) at org.apache.flink.streaming.runtime.tasks.StreamTask$1.onEvent(StreamTask.java:574) ... 8 more Niels On Tue, Dec 1, 2015 at 4:41 PM, Niels Basjes wrote: > Thanks! > I'm going to study this code closely! > > Nie

  1   2   >