Ivan,

How soon are you going to merge the fix into master? Many persistence-related
optimizations have already stacked up. We can probably release them sooner
if the community agrees.

--
Denis

On Thu, Mar 22, 2018 at 5:22 AM, Ivan Rakov <ivan.glu...@gmail.com> wrote:

> Thanks all!
> We seem to have reached a consensus on this issue. I'll just add the
> necessary fsyncs under IGNITE-7754.
>
> Best Regards,
> Ivan Rakov
>
>
> On 22.03.2018 15:13, Ilya Lantukh wrote:
>
>> +1 for fixing LOG_ONLY. If the current implementation doesn't protect from
>> data corruption, it doesn't make sense.
>>
>> On Wed, Mar 21, 2018 at 10:38 PM, Denis Magda <dma...@apache.org> wrote:
>>
>>> +1 for the fix of LOG_ONLY
>>>
>>> On Wed, Mar 21, 2018 at 11:23 AM, Alexey Goncharuk <
>>> alexey.goncha...@gmail.com> wrote:
>>>
>>>> +1 for fixing LOG_ONLY to enforce corruption safety given the provided
>>>> performance results.
>>>>
>>>> 2018-03-21 18:20 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:
>>>>
>>>>> +1 for accepting the drop in LOG_ONLY. 7% is not that much, and not a drop
>>>>> at all, given that we are fixing a bug. I.e., had we implemented it
>>>>> correctly in the first place, we would never have noticed any "drop".
>>>>> I do not understand why anyone would want to use the current broken mode.
>>>>>
>>>>> On Wed, Mar 21, 2018 at 6:11 PM, Dmitry Pavlov <dpavlov....@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi, I think option 1 is better. As Val said, any mode that allows
>>>>>> corruption does not make much sense.
>>>>>>
>>>>>> What Ivan described here as a drop is still a significant performance
>>>>>> boost relative to the old DEFAULT mode (now FSYNC).
>>>>>>
>>>>>> Sincerely,
>>>>>> Dmitriy Pavlov
>>>>>>
>>>>>> Wed, Mar 21, 2018 at 17:56, Ivan Rakov <ivan.glu...@gmail.com>:
>>>>>>
>>>>>>> I've attached benchmark results to the JIRA ticket.
>>>>>>> We observe a ~7% drop in the "fair" LOG_ONLY_SAFE mode, regardless of
>>>>>>> whether WAL compaction is enabled. It's a pretty significant drop: WAL
>>>>>>> compaction itself gives only a ~3% drop.
>>>>>>>
>>>>>>> I see two options here:
>>>>>>> 1) Change LOG_ONLY behavior. That implies that we'll be ready to release
>>>>>>> AI 2.5 with a 7% drop.
>>>>>>> 2) Introduce LOG_ONLY_SAFE, make it the default, and add a release note to
>>>>>>> AI 2.5 that we added power loss durability in the default mode, but users
>>>>>>> may fall back to the previous LOG_ONLY in order to retain performance.
>>>>>>>
>>>>>>> Thoughts?
>>>>>>>
>>>>>>> Best Regards,
>>>>>>> Ivan Rakov
>>>>>>>
>>>>>>> On 20.03.2018 16:00, Ivan Rakov wrote:
>>>>>>>
>>>>>>>> Val,
>>>>>>>>
>>>>>>>>> If a storage is in a corrupted state, does it mean that it needs to
>>>>>>>>> be completely removed and the cluster needs to be restarted without
>>>>>>>>> data?
>>>>>>>>
>>>>>>>> Yes, there's a chance that in LOG_ONLY all local data will be lost,
>>>>>>>> but only in the *power loss / OS crash* case.
>>>>>>>> kill -9, a JVM crash, the death of a critical system thread and all
>>>>>>>> other cases that usually take place are variations of *process crash*.
>>>>>>>> All WAL modes (except NONE, of course) ensure corruption safety in
>>>>>>>> case of a process crash.
>>>>>>>>
>>>>>>>>> If so, I'm not sure any mode that allows corruption makes much sense
>>>>>>>>> to me.
>>>>>>>>
>>>>>>>> It depends on the performance impact of enforcing power-loss
>>>>>>>> corruption safety. The price of full protection from power loss is
>>>>>>>> high: FSYNC is way slower (2-10 times) than other WAL modes. The
>>>>>>>> question is whether ensuring weaker guarantees (corruption can't
>>>>>>>> happen, but loss of the last updates can) will affect performance as
>>>>>>>> badly as strong guarantees. I'll share benchmark results soon.
>>>>>>>>
>>>>>>>> Best Regards,
>>>>>>>> Ivan Rakov
>>>>>>>>
>>>>>>>> On 20.03.2018 5:09, Valentin Kulichenko wrote:
>>>>>>>>
>>>>>>>>> Guys,
>>>>>>>>>
>>>>>>>>> What do we mean by "data corruption" here? If a storage is in a
>>>>>>>>> corrupted state, does it mean that it needs to be completely removed
>>>>>>>>> and the cluster needs to be restarted without data? If so, I'm not
>>>>>>>>> sure any mode that allows corruption makes much sense to me. How am I
>>>>>>>>> supposed to use a database if virtually any failure can end with
>>>>>>>>> complete loss of data?
>>>>>>>>>
>>>>>>>>> In any case, this definitely should not be the default behavior. If a
>>>>>>>>> user ever switches to a corruption-unsafe mode, there should be a
>>>>>>>>> clear warning about this.
>>>>>>>>>
>>>>>>>>> -Val
>>>>>>>>>
>>>>>>>>> On Fri, Mar 16, 2018 at 1:06 AM, Ivan Rakov <ivan.glu...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Ticket to track changes:
>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-7754
>>>>>>>>>>
>>>>>>>>>> Best Regards,
>>>>>>>>>> Ivan Rakov
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 16.03.2018 10:58, Dmitriy Setrakyan wrote:
>>>>>>>>>>
>>>>>>>>>>> On Fri, Mar 16, 2018 at 12:55 AM, Ivan Rakov <ivan.glu...@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Vladimir,
>>>>>>>>>>>>
>>>>>>>>>>>> Unlike BACKGROUND, LOG_ONLY provides strict write guarantees
>>>>>>>>>>>> unless a power loss has happened.
>>>>>>>>>>>> It seems we need to measure the performance difference to decide
>>>>>>>>>>>> whether we need a separate WAL mode. If it is negligible, we'll
>>>>>>>>>>>> just fix these bugs without introducing a new mode; if it is
>>>>>>>>>>>> noticeable, we'll continue the discussion about introducing
>>>>>>>>>>>> LOG_ONLY_SAFE.
>>>>>>>>>>>> Makes sense?
>>>>>>>>>>>
>>>>>>>>>>> Yes, this sounds like the right approach.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>
>>
>
