Re: Broadcast state
By using Redis, you can store all data in one job in one single Redis, no need one slot one Redis, what do you think? Best, Congxian Navneeth Krishnan 于2019年10月18日周五 上午4:47写道: > Ya, there will not be a problem of duplicates. But what I'm trying to > achieve is if there a large static state which needs to be present just one > per node rather than storing it per slot that would be ideal. The reason > being is that the state is quite large around 100GB of mostly static data > and it is not needed at per slot level. It can be at per instance level > where each slot can read from this shared memory. > > Thanks > > On Wed, Oct 9, 2019 at 12:13 AM Congxian Qiu > wrote: > >> Hi, >> >> After using Redis, why there need to care about eliminate duplicated >> data, if you specify the same key, then Redis will do the deduplicate >> things. >> >> Best, >> Congxian >> >> >> Fabian Hueske 于2019年10月2日周三 下午5:30写道: >> >>> Hi, >>> >>> State is always associated with a single task in Flink. >>> The state of a task cannot be accessed by other tasks of the same >>> operator or tasks of other operators. >>> This is true for every type of state, including broadcast state. >>> >>> Best, Fabian >>> >>> >>> Am Di., 1. Okt. 2019 um 08:22 Uhr schrieb Navneeth Krishnan < >>> reachnavnee...@gmail.com>: >>> Hi, I can use redis but I’m still having hard time figuring out how I can eliminate duplicate data. Today without broadcast state in 1.4 I’m using cache to lazy load the data. I thought the broadcast state will be similar to that of kafka streams where I have read access to the state across the pipeline. That will indeed solve a lot of problems. Is there some way I can do the same with flink? Thanks! On Mon, Sep 30, 2019 at 10:36 PM Congxian Qiu wrote: > Hi, > > Could you use some cache system such as HBase or Reids to storage this > data, and query from the cache if needed? > > Best, > Congxian > > > Navneeth Krishnan 于2019年10月1日周二 上午10:15写道: > >> Thanks Oytun. The problem with doing that is the same data will be >> have to be stored multiple times wasting memory. In my case there will >> around million entries which needs to be used by at least two operators >> for >> now. >> >> Thanks >> >> On Mon, Sep 30, 2019 at 5:42 PM Oytun Tez wrote: >> >>> This is how we currently use broadcast state. Our states are >>> re-usable (code-wise), every operator that wants to consume basically >>> keeps >>> the same descriptor state locally by processBroadcastElement'ing into a >>> local state. >>> >>> I am open to suggestions. I see this as a hard drawback of dataflow >>> programming or Flink framework? >>> >>> >>> >>> --- >>> Oytun Tez >>> >>> *M O T A W O R D* >>> The World's Fastest Human Translation Platform. >>> oy...@motaword.com — www.motaword.com >>> >>> >>> On Mon, Sep 30, 2019 at 8:40 PM Oytun Tez >>> wrote: >>> You can re-use the broadcasted state (along with its descriptor) that comes into your KeyedBroadcastProcessFunction, in another operator downstream. that's basically duplicating the broadcasted state whichever operator you want to use, every time. --- Oytun Tez *M O T A W O R D* The World's Fastest Human Translation Platform. oy...@motaword.com — www.motaword.com On Mon, Sep 30, 2019 at 8:29 PM Navneeth Krishnan < reachnavnee...@gmail.com> wrote: > Hi All, > > Is it possible to access a broadcast state across the pipeline? > For example, say I have a KeyedBroadcastProcessFunction which adds the > incoming data to state and I have downstream operator where I need > the same > state as well, would I be able to just read the broadcast state with a > readonly view. I know this is possible in kafka streams. > > Thanks >
Re: [ANNOUNCE] Apache Flink 1.9.1 released
Thanks a lot to Jark, Jincheng, and everyone that make this release possible. Best, Hequn On Sat, Oct 19, 2019 at 10:29 PM Zili Chen wrote: > Thanks a lot for being release manager Jark. Great work! > > Best, > tison. > > > Till Rohrmann 于2019年10月19日周六 下午10:15写道: > >> Thanks a lot for being our release manager Jark and thanks to everyone >> who has helped to make this release possible. >> >> Cheers, >> Till >> >> On Sat, Oct 19, 2019 at 3:26 PM Jark Wu wrote: >> >>> Hi, >>> >>> The Apache Flink community is very happy to announce the release of >>> Apache Flink 1.9.1, which is the first bugfix release for the Apache Flink >>> 1.9 series. >>> >>> Apache Flink® is an open-source stream processing framework for >>> distributed, high-performing, always-available, and accurate data streaming >>> applications. >>> >>> The release is available for download at: >>> https://flink.apache.org/downloads.html >>> >>> Please check out the release blog post for an overview of the >>> improvements for this bugfix release: >>> https://flink.apache.org/news/2019/10/18/release-1.9.1.html >>> >>> The full release notes are available in Jira: >>> https://issues.apache.org/jira/projects/FLINK/versions/12346003 >>> >>> We would like to thank all contributors of the Apache Flink community >>> who helped to verify this release and made this release possible! >>> Great thanks to @Jincheng for helping finalize this release. >>> >>> Regards, >>> Jark Wu >>> >>
Re: [ANNOUNCE] Apache Flink 1.9.1 released
Thanks a lot to Jark, Jincheng, and everyone that make this release possible. Best, Hequn On Sat, Oct 19, 2019 at 10:29 PM Zili Chen wrote: > Thanks a lot for being release manager Jark. Great work! > > Best, > tison. > > > Till Rohrmann 于2019年10月19日周六 下午10:15写道: > >> Thanks a lot for being our release manager Jark and thanks to everyone >> who has helped to make this release possible. >> >> Cheers, >> Till >> >> On Sat, Oct 19, 2019 at 3:26 PM Jark Wu wrote: >> >>> Hi, >>> >>> The Apache Flink community is very happy to announce the release of >>> Apache Flink 1.9.1, which is the first bugfix release for the Apache Flink >>> 1.9 series. >>> >>> Apache Flink® is an open-source stream processing framework for >>> distributed, high-performing, always-available, and accurate data streaming >>> applications. >>> >>> The release is available for download at: >>> https://flink.apache.org/downloads.html >>> >>> Please check out the release blog post for an overview of the >>> improvements for this bugfix release: >>> https://flink.apache.org/news/2019/10/18/release-1.9.1.html >>> >>> The full release notes are available in Jira: >>> https://issues.apache.org/jira/projects/FLINK/versions/12346003 >>> >>> We would like to thank all contributors of the Apache Flink community >>> who helped to verify this release and made this release possible! >>> Great thanks to @Jincheng for helping finalize this release. >>> >>> Regards, >>> Jark Wu >>> >>
Submitting jobs via REST
I have a flink docker image with my job's JAR already contained within. I would like to run a job with this jar via the REST api. Is that possible? I know I can run a job via REST using JarID (ID assigned by flink when a jar is uploaded). However I don't have such an ID since this jar is already part of the image. Via CLI I can start a job using classpath. But can I do the same via the REST api. Any other ways to achieve this? Thanks Tim
Re: [ANNOUNCE] Apache Flink 1.9.1 released
Thanks a lot for being release manager Jark. Great work! Best, tison. Till Rohrmann 于2019年10月19日周六 下午10:15写道: > Thanks a lot for being our release manager Jark and thanks to everyone who > has helped to make this release possible. > > Cheers, > Till > > On Sat, Oct 19, 2019 at 3:26 PM Jark Wu wrote: > >> Hi, >> >> The Apache Flink community is very happy to announce the release of >> Apache Flink 1.9.1, which is the first bugfix release for the Apache Flink >> 1.9 series. >> >> Apache Flink® is an open-source stream processing framework for >> distributed, high-performing, always-available, and accurate data streaming >> applications. >> >> The release is available for download at: >> https://flink.apache.org/downloads.html >> >> Please check out the release blog post for an overview of the >> improvements for this bugfix release: >> https://flink.apache.org/news/2019/10/18/release-1.9.1.html >> >> The full release notes are available in Jira: >> https://issues.apache.org/jira/projects/FLINK/versions/12346003 >> >> We would like to thank all contributors of the Apache Flink community who >> helped to verify this release and made this release possible! >> Great thanks to @Jincheng for helping finalize this release. >> >> Regards, >> Jark Wu >> >
Re: [ANNOUNCE] Apache Flink 1.9.1 released
Thanks a lot for being release manager Jark. Great work! Best, tison. Till Rohrmann 于2019年10月19日周六 下午10:15写道: > Thanks a lot for being our release manager Jark and thanks to everyone who > has helped to make this release possible. > > Cheers, > Till > > On Sat, Oct 19, 2019 at 3:26 PM Jark Wu wrote: > >> Hi, >> >> The Apache Flink community is very happy to announce the release of >> Apache Flink 1.9.1, which is the first bugfix release for the Apache Flink >> 1.9 series. >> >> Apache Flink® is an open-source stream processing framework for >> distributed, high-performing, always-available, and accurate data streaming >> applications. >> >> The release is available for download at: >> https://flink.apache.org/downloads.html >> >> Please check out the release blog post for an overview of the >> improvements for this bugfix release: >> https://flink.apache.org/news/2019/10/18/release-1.9.1.html >> >> The full release notes are available in Jira: >> https://issues.apache.org/jira/projects/FLINK/versions/12346003 >> >> We would like to thank all contributors of the Apache Flink community who >> helped to verify this release and made this release possible! >> Great thanks to @Jincheng for helping finalize this release. >> >> Regards, >> Jark Wu >> >
Re: [ANNOUNCE] Apache Flink 1.9.1 released
Thanks a lot for being our release manager Jark and thanks to everyone who has helped to make this release possible. Cheers, Till On Sat, Oct 19, 2019 at 3:26 PM Jark Wu wrote: > Hi, > > The Apache Flink community is very happy to announce the release of Apache > Flink 1.9.1, which is the first bugfix release for the Apache Flink 1.9 > series. > > Apache Flink® is an open-source stream processing framework for > distributed, high-performing, always-available, and accurate data streaming > applications. > > The release is available for download at: > https://flink.apache.org/downloads.html > > Please check out the release blog post for an overview of the improvements > for this bugfix release: > https://flink.apache.org/news/2019/10/18/release-1.9.1.html > > The full release notes are available in Jira: > https://issues.apache.org/jira/projects/FLINK/versions/12346003 > > We would like to thank all contributors of the Apache Flink community who > helped to verify this release and made this release possible! > Great thanks to @Jincheng for helping finalize this release. > > Regards, > Jark Wu >
Re: [ANNOUNCE] Apache Flink 1.9.1 released
Thanks a lot for being our release manager Jark and thanks to everyone who has helped to make this release possible. Cheers, Till On Sat, Oct 19, 2019 at 3:26 PM Jark Wu wrote: > Hi, > > The Apache Flink community is very happy to announce the release of Apache > Flink 1.9.1, which is the first bugfix release for the Apache Flink 1.9 > series. > > Apache Flink® is an open-source stream processing framework for > distributed, high-performing, always-available, and accurate data streaming > applications. > > The release is available for download at: > https://flink.apache.org/downloads.html > > Please check out the release blog post for an overview of the improvements > for this bugfix release: > https://flink.apache.org/news/2019/10/18/release-1.9.1.html > > The full release notes are available in Jira: > https://issues.apache.org/jira/projects/FLINK/versions/12346003 > > We would like to thank all contributors of the Apache Flink community who > helped to verify this release and made this release possible! > Great thanks to @Jincheng for helping finalize this release. > > Regards, > Jark Wu >
Re: [ANNOUNCE] Apache Flink 1.9.1 released
On 2019/10/19 9:07 下午, Jark Wu wrote: The Apache Flink community is very happy to announce the release of Apache Flink 1.9.1, which is the first bugfix release for the Apache Flink 1.9 series. Apache Flink® is an open-source stream processing framework for distributed, high-performing, always-available, and accurate data streaming applications. that's awesome. thanks for your work. regards.
[ANNOUNCE] Apache Flink 1.9.1 released
Hi, The Apache Flink community is very happy to announce the release of Apache Flink 1.9.1, which is the first bugfix release for the Apache Flink 1.9 series. Apache Flink® is an open-source stream processing framework for distributed, high-performing, always-available, and accurate data streaming applications. The release is available for download at: https://flink.apache.org/downloads.html Please check out the release blog post for an overview of the improvements for this bugfix release: https://flink.apache.org/news/2019/10/18/release-1.9.1.html The full release notes are available in Jira: https://issues.apache.org/jira/projects/FLINK/versions/12346003 We would like to thank all contributors of the Apache Flink community who helped to verify this release and made this release possible! Great thanks to @Jincheng for helping finalize this release. Regards, Jark Wu
[ANNOUNCE] Apache Flink 1.9.1 released
Hi, The Apache Flink community is very happy to announce the release of Apache Flink 1.9.1, which is the first bugfix release for the Apache Flink 1.9 series. Apache Flink® is an open-source stream processing framework for distributed, high-performing, always-available, and accurate data streaming applications. The release is available for download at: https://flink.apache.org/downloads.html Please check out the release blog post for an overview of the improvements for this bugfix release: https://flink.apache.org/news/2019/10/18/release-1.9.1.html The full release notes are available in Jira: https://issues.apache.org/jira/projects/FLINK/versions/12346003 We would like to thank all contributors of the Apache Flink community who helped to verify this release and made this release possible! Great thanks to @Jincheng for helping finalize this release. Regards, Jark Wu
Re: Customize Part file naming (Flink 1.9.0)
Beware when using Bucketing sink as it does not follow exactly once semantics. Also it has issues with s3 consistency. On Sat, Oct 19, 2019, 1:42 PM Ravi Bhushan Ratnakar < ravibhushanratna...@gmail.com> wrote: > Hi, > > As an alternative, you may use BucketingSink which provides you the > provision to customize suffix/prefix. > > https://ci.apache.org/projects/flink/flink-docs-release-1.9/api/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.html > > Regards, > Ravi > > On Sat, Oct 19, 2019 at 3:54 AM amran dean wrote: > >> Hello, >> StreamingFileSink's part file naming convention is not adjustable. It has >> form: *part--. * >> >> My use case for StreamingFileSink is a Kafka -> S3 pipeline, and files >> are read and processed from S3 using spark. In almost all cases, I want to >> compress raw data before writing to S3 using the BulkFormat. >> >> Spark relies on filename extensions to do compression inference, so the >> current naming scheme results in gibberish. I see that 1.10 currently >> provides the ability to customize the suffix/prefix, but I really need an >> alternative solution to this as soon as possible. Can this be backported to >> 1.9, or are there alternatives? >> >> >>
Re: Customize Part file naming (Flink 1.9.0)
Hi, As an alternative, you may use BucketingSink which provides you the provision to customize suffix/prefix. https://ci.apache.org/projects/flink/flink-docs-release-1.9/api/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.html Regards, Ravi On Sat, Oct 19, 2019 at 3:54 AM amran dean wrote: > Hello, > StreamingFileSink's part file naming convention is not adjustable. It has > form: *part--. * > > My use case for StreamingFileSink is a Kafka -> S3 pipeline, and files are > read and processed from S3 using spark. In almost all cases, I want to > compress raw data before writing to S3 using the BulkFormat. > > Spark relies on filename extensions to do compression inference, so the > current naming scheme results in gibberish. I see that 1.10 currently > provides the ability to customize the suffix/prefix, but I really need an > alternative solution to this as soon as possible. Can this be backported to > 1.9, or are there alternatives? > > >