[ 
https://issues.apache.org/jira/browse/IGNITE-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15252520#comment-15252520
 ] 

Ivan Veselovsky edited comment on IGNITE-2938 at 4/21/16 7:43 PM:
------------------------------------------------------------------

The problem description:
The IgfsDataManager#dataBlock() method has the following structure (simplified code):
{code}
Future<byte[]> f = dataCache.getAsync(blockKey);

f = f.chain(f0 -> {
    byte[] result = f0.get(); // The block as found in the data cache (null on a miss).

    if (result == null)
        result = secondaryReader.read(...); // Cache miss: read the block from the secondary file system.

    putSafe(blockKey, result); // Cache the block asynchronously.

    return result;
});

return f;
{code}

Here {{putSafe}} is a kind of asynchronous put into the data cache, executed in a special thread pool.
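
For illustration, a minimal sketch of what such a putSafe could look like, assuming a dedicated executor; the field, names, and signatures here are my assumptions, not the actual IGFS internals:
{code}
// Illustrative sketch only: a dedicated pool decouples the cache write from the reader thread.
private final ExecutorService putExec = Executors.newSingleThreadExecutor();

void putSafe(final IgfsBlockKey blockKey, final byte[] block) {
    putExec.submit(new Runnable() {
        @Override public void run() {
            // This asynchronous write is exactly the operation that can race with an append:
            dataCache.put(blockKey, block);
        }
    });
}
{code}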

This code is invoked from two places:
1) IgfsMetaManager#appendDual() -- in this case it reads the last incomplete block of the file in order to append data to it.
2) IgfsInputStreamImpl -- in this case we just read all the blocks of the file.

The original problem observed in the test is that the dataCache.put() operation, invoked asynchronously from putSafe(), can overwrite a new block written during the append operation.

The following statement seems to be true: only *the last incomplete block of the file* can cause such a conflict. This is because all other blocks can be written only once from the IGFS side: the IGFS API only allows appending to a file, so no random-access write is possible. Also, the IGFS secondary file system model does not assume that a file can be modified "externally" on the secondary file system. If that happens, the file should be completely re-read, and this new file will have new IDs in all caches. So all blocks except the last one are constant. But we should bear in mind that they can disappear from the data cache as a result of eviction, in which case they will be re-read from the secondary file system again.

So, the following change is proposed to fix the problem:
0) Introduce a new field IgfsEntryInfo#version (exact name TBD) that would uniquely identify the file data state.
1) In the case of an append operation we may put the last incomplete block into the data cache *only* before the write lock (meta#invokeLock()) on the file is acquired. Namely, the pseudo-code of the putSafe task should look like this:
{code}
try (Transaction tx = metaCache.startTx()) {
    IgfsFileInfo info = metaCache.get(fileId);

    long currentVersion = info.version();
    IgniteUuid lockId = info.lockId();

    if (currentVersion == initialVersion && lockId == null) {
        // "Put if absent" because there is no use case where we should overwrite anything here.
        dataCache.putIfAbsent(blockKey, block);
    }

    tx.commit();
}
{code}
Here initialVersion is the file version at the moment the append operation began.
The currentVersion == initialVersion condition is needed because if the file has been appended several times since then, the block we have is obsolete and we should drop it.
The lockId == null condition is needed because after the lock is acquired, a new block may be written at any time. Since we do "putIfAbsent", it might seem that we may put even after the lock is acquired. But this is not the case: after the append operation has put a new block, that block may be evicted immediately, so the block is absent (null) again. In that case we would again get the wrong situation where an old block is written over the new one.
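
For proposal (0), a hypothetical sketch of what the version field could look like (the name is TBD per the above, and the exact update point is an assumption):
{code}
// Illustrative only: the file info carries a counter that changes on every data
// modification, so a block captured before an append is detectable as stale.
class IgfsEntryInfo {
    private final long ver; // Uniquely identifies the file data state.

    IgfsEntryInfo(long ver) { this.ver = ver; }

    long version() { return ver; }

    // Invoked whenever an append (or any other data modification) is committed:
    IgfsEntryInfo bumpVersion() { return new IgfsEntryInfo(ver + 1); }
}
{code}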

In the IgfsInputStreamImpl read implementation we should apply exactly the same async put logic for the last incomplete block.
(The last block index can always be easily calculated provided we have the IgfsEntryInfo: it is fileLength / blockSize if fileLength % blockSize != 0.)
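
As a small worked example of that calculation (the helper and its -1 sentinel are illustrative, not an existing API):
{code}
// fileLen = 1000, blockSize = 256: blocks 0..2 are full, block 3 holds the
// remaining 232 bytes, so the last incomplete block index is 1000 / 256 = 3.
static long lastIncompleteBlockIdx(long fileLen, int blockSize) {
    return fileLen % blockSize != 0 ? fileLen / blockSize : -1; // -1: no incomplete block.
}
{code}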

When we read the other blocks (all except the last incomplete one) in IgfsInputStreamImpl, we should *only* check that the file still exists and is not scheduled for deletion. Otherwise we would just cache garbage.
We don't need any version or lockId checks because these blocks cannot be modified (if they are modified externally, then we should re-create the IGFS file completely).
So, this code should look like:
{code}
try (Transaction tx = metaCache.startTx()) {
    IgfsFileInfo info = metaCache.get(fileId);

    if (info != null) {
        IgniteUuid lockId = info.lockId();

        // Note that we don't check whether lockId is null, because these blocks must be constant:
        if (lockId != DELETE_LOCK_ID) {
            // We cannot assert the result here because a concurrent input stream
            // might have performed the same put already.
            dataCache.putIfAbsent(blockKey, block);
        }
    }

    tx.commit();
}
{code}





> IgfsBackupsDualAsyncSelfTest.testAppendParentMissing and 
> IgfsBackupsDualAsyncSelfTest.testAppendParentMissingPartially fail sometimes 
> on master
> -----------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-2938
>                 URL: https://issues.apache.org/jira/browse/IGNITE-2938
>             Project: Ignite
>          Issue Type: Test
>          Components: IGFS
>    Affects Versions: 1.5.0.final
>            Reporter: Ivan Veselovsky
>            Assignee: Ivan Veselovsky
>             Fix For: 1.6
>
>
> Tests 
>   IgfsBackupsDualAsyncSelfTest.testAppendParentMissing        
>   IgfsBackupsDualAsyncSelfTest.testAppendParentMissingPartially       
> fail from time to time on master -- need to investigate. 
> It looks like this started to happen after the fix
> https://issues.apache.org/jira/browse/IGNITE-1631.
> The failure happens with probability ~1/50:
> {code}
> [18:14:50,772][INFO ][main][root] >>> Starting test: 
> IgfsBackupsDualAsyncSelfTest#testAppendParentMissingPartially <<<
> [18:14:50,792][ERROR][main][root] Test failed.
> java.io.IOException: Inconsistent file's data block (incorrectly written?) 
> [path=/dir/subdir/file, blockIdx=0, blockSize=128, expectedBlockSize=256, 
> fileBlockSize=524288, fileLen=256]
>       at 
> org.apache.ignite.internal.processors.igfs.IgfsInputStreamImpl.block(IgfsInputStreamImpl.java:485)
>       at 
> org.apache.ignite.internal.processors.igfs.IgfsInputStreamImpl.blockFragmentizerSafe(IgfsInputStreamImpl.java:399)
>       at 
> org.apache.ignite.internal.processors.igfs.IgfsInputStreamImpl.readFromStore(IgfsInputStreamImpl.java:373)
>       at 
> org.apache.ignite.internal.processors.igfs.IgfsInputStreamImpl.readFully(IgfsInputStreamImpl.java:222)
>       at 
> org.apache.ignite.internal.processors.igfs.IgfsInputStreamImpl.readFully(IgfsInputStreamImpl.java:216)
>       at 
> org.apache.ignite.internal.processors.igfs.IgfsAbstractSelfTest.checkFileContent(IgfsAbstractSelfTest.java:2999)
>       at 
> org.apache.ignite.internal.processors.igfs.IgfsAbstractSelfTest.checkFile(IgfsAbstractSelfTest.java:2969)
>       at 
> org.apache.ignite.internal.processors.igfs.IgfsDualAbstractSelfTest.testAppendParentMissingPartially(IgfsDualAbstractSelfTest.java:1364)
>       at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at junit.framework.TestCase.runTest(TestCase.java:176)
>       at 
> org.apache.ignite.testframework.junits.GridAbstractTest.runTestInternal(GridAbstractTest.java:1759)
>       at 
> org.apache.ignite.testframework.junits.GridAbstractTest.access$000(GridAbstractTest.java:118)
>       at 
> org.apache.ignite.testframework.junits.GridAbstractTest$4.run(GridAbstractTest.java:1697)
>       at java.lang.Thread.run(Thread.java:745)
> {code}


