Kostas and Gordon,

Thanks for the suggestions!

I'm on RocksDB. We don't have that setting configured, so it should be at the default of 1024 bytes. This is the full "state.*" section showing in the JobManager UI.

[image: Screen Shot 2020-03-04 at 9.56.20 AM.png]

Jacob

On Wed, Mar 4, 2020 at 2:45 AM Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:

> Hi Jacob,
>
> Apart from what Klou already mentioned, one slightly possible reason:
>
> If you are using the FsStateBackend, it is also possible that your state
> is small enough to be considered to be stored inline within the metadata
> file.
> That is governed by the "state.backend.fs.memory-threshold" configuration,
> with a default value of 1024 bytes, or can also be configured with the
> `fileStateSizeThreshold` argument when constructing the `FsStateBackend`.
> The purpose of that threshold is to ensure that the backend does not
> create a large amount of very small files, where potentially the file
> pointers are actually larger than the state itself.
>
> Cheers,
> Gordon
>
> On Wed, Mar 4, 2020 at 6:17 PM Kostas Kloudas <kklou...@gmail.com> wrote:
>
>> Hi Jacob,
>>
>> Could you specify which StateBackend you are using?
>>
>> The reason I am asking is that, from the documentation in [1]:
>>
>> "Note that if you use the MemoryStateBackend, metadata and savepoint
>> state will be stored in the _metadata file. Since it is
>> self-contained, you may move the file and restore from any location."
>>
>> I am also cc'ing Gordon who may know a bit more about state formats.
>>
>> I hope this helps,
>> Kostas
>>
>> [1]
>> https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/state/savepoints.html
>>
>> On Wed, Mar 4, 2020 at 1:25 AM Jacob Sevart <jsev...@uber.com> wrote:
>> >
>> > Per the documentation:
>> >
>> > "The meta data file of a Savepoint contains (primarily) pointers to all
>> > files on stable storage that are part of the Savepoint, in form of absolute
>> > paths."
>> >
>> > I somehow have a _metadata file that's 1.9GB. Running strings on it I
>> > find 962 strings, most of which look like HDFS paths, which leaves a lot of
>> > that file-size unexplained. What else is in there, and how exactly could
>> > this be happening?
>> >
>> > We're running 1.6.
>> >
>> > Jacob
>> >

--
Jacob Sevart
Software Engineer, Safety
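
For anyone who wants to put a number on the "unexplained" part of a _metadata file, here is a minimal sketch in Python that approximates what `strings` does and reports how many bytes the printable runs account for. It is not part of Flink and does not parse the savepoint format. The names `PRINTABLE_RUN`, `summarize` and `summarize_file` are made up for illustration, and the 4-character minimum is an assumption chosen to match the usual `strings` default.

```python
import mmap
import re

# Runs of at least 4 printable ASCII bytes (assumed minimum, similar to `strings`).
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def summarize(data):
    """Return (run_count, bytes_in_runs, other_bytes) for a bytes-like object."""
    run_count = 0
    bytes_in_runs = 0
    for match in PRINTABLE_RUN.finditer(data):
        run_count += 1
        bytes_in_runs += match.end() - match.start()
    return run_count, bytes_in_runs, len(data) - bytes_in_runs


def summarize_file(path):
    """Same as summarize(), but memory-maps the file instead of reading it whole.

    Memory-mapping avoids loading a multi-gigabyte file into memory.
    Note that mmap raises ValueError on an empty file.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return summarize(mapped)


if __name__ == "__main__":
    import sys

    count, in_runs, other = summarize_file(sys.argv[1])
    print(f"{count} printable runs, {in_runs} bytes in runs, {other} other bytes")
```

If the "other bytes" figure dominates, as it apparently does here, the bulk of the file is binary data rather than paths, which would be consistent with state being stored inline in the metadata file.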