Kostas and Gordon,

Thanks for the suggestions! I'm on RocksDB. We don't have that setting
configured, so it should be at the default of 1024 bytes. This is the full
"state.*" section shown in the JobManager UI.

[image: Screen Shot 2020-03-04 at 9.56.20 AM.png]
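
As a rough sanity check on the threshold Gordon describes below, here is a
minimal Python sketch of the inlining rule. It is a toy model, not Flink code:
`place_state`, `chunks_to_reach`, the `<=` boundary, and reading 1.9GB as
1.9 * 10^9 bytes are all assumptions made for illustration.

```python
# Toy model of "small state is stored inline in _metadata". Not Flink code.
# Default of state.backend.fs.memory-threshold, per Gordon's note.
DEFAULT_MEMORY_THRESHOLD = 1024  # bytes


def place_state(chunk_sizes, threshold=DEFAULT_MEMORY_THRESHOLD):
    """Return (inline_bytes, file_count) for serialized state chunks.

    Chunks at or below the threshold are counted as inlined in the
    metadata file; larger ones are counted as separate files.
    The `<=` boundary is an assumption, not taken from the Flink source.
    """
    inline_bytes = 0
    file_count = 0
    for size in chunk_sizes:
        if size <= threshold:
            inline_bytes += size
        else:
            file_count += 1
    return inline_bytes, file_count


def chunks_to_reach(total_bytes, threshold=DEFAULT_MEMORY_THRESHOLD):
    """Minimum number of inlined chunks needed to add up to total_bytes."""
    return -(-total_bytes // threshold)  # integer ceiling division


if __name__ == "__main__":
    print(place_state([100, 1024, 2000]))
    print(chunks_to_reach(1_900_000_000))  # assumes 1.9GB = 1.9 * 10^9 bytes
```

Under these assumptions, inlining alone would need at least about 1.86 million
sub-threshold chunks to reach 1.9GB, so it would only explain the file size if
the job produced that many small state handles.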

Jacob

On Wed, Mar 4, 2020 at 2:45 AM Tzu-Li (Gordon) Tai <tzuli...@apache.org>
wrote:

> Hi Jacob,
>
> Apart from what Klou already mentioned, one other possible reason:
>
> If you are using the FsStateBackend, it is also possible that your state
> is small enough to be stored inline within the metadata file.
> That is governed by the "state.backend.fs.memory-threshold" configuration,
> which defaults to 1024 bytes and can also be set with the
> `fileStateSizeThreshold` argument when constructing the `FsStateBackend`.
> The purpose of that threshold is to ensure that the backend does not
> create a large number of very small files, where the file pointers are
> potentially larger than the state itself.
>
> Cheers,
> Gordon
>
>
>
> On Wed, Mar 4, 2020 at 6:17 PM Kostas Kloudas <kklou...@gmail.com> wrote:
>
>> Hi Jacob,
>>
>> Could you specify which StateBackend you are using?
>>
>> The reason I am asking is that, from the documentation in [1]:
>>
>> "Note that if you use the MemoryStateBackend, metadata and savepoint
>> state will be stored in the _metadata file. Since it is
>> self-contained, you may move the file and restore from any location."
>>
>> I am also cc'ing Gordon who may know a bit more about state formats.
>>
>> I hope this helps,
>> Kostas
>>
>> [1]
>> https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/state/savepoints.html
>>
>> On Wed, Mar 4, 2020 at 1:25 AM Jacob Sevart <jsev...@uber.com> wrote:
>> >
>> > Per the documentation:
>> >
>> > "The meta data file of a Savepoint contains (primarily) pointers to all
>> > files on stable storage that are part of the Savepoint, in form of
>> > absolute paths."
>> >
>> > I somehow have a _metadata file that's 1.9GB. Running `strings` on it, I
>> > find 962 strings, most of which look like HDFS paths, which leaves a lot
>> > of that file size unexplained. What else is in there, and how exactly
>> > could this be happening?
>> >
>> > We're running 1.6.
>> >
>> > Jacob
>>
>

-- 
Jacob Sevart
Software Engineer, Safety
