[jira] [Commented] (IGNITE-3877) Clarify if IgfsFile -> FileStatus conversion should treat groupBlockSize as blockSize
[ https://issues.apache.org/jira/browse/IGNITE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15784785#comment-15784785 ]

Vladimir Ozerov commented on IGNITE-3877:
-----------------------------------------

Closing, as the whole fix is inside IGNITE-481.

> Clarify if IgfsFile -> FileStatus conversion should treat groupBlockSize as blockSize
> -------------------------------------------------------------------------------------
>
>                 Key: IGNITE-3877
>                 URL: https://issues.apache.org/jira/browse/IGNITE-3877
>             Project: Ignite
>          Issue Type: Bug
>          Components: IGFS
>    Affects Versions: 1.6
>            Reporter: Ivan Veselovsky
>            Assignee: Vladimir Ozerov
>             Fix For: 2.0
>
>
> During repair of the Metrics tests, the test org.apache.ignite.igfs.Hadoop1DualAbstractTest#testMetricsBlock revealed the following problem:
> the org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem#convert(org.apache.ignite.igfs.IgfsFile) method treats groupBlockSize as the blockSize for the Hadoop FileStatus.
> groupBlockSize can be several times larger than blockSize, so the blockSize in the status differs from that in the original IgfsFile.
> Changing file.groupBlockSize() to file.blockSize() fixes the problem in the metrics tests, but creates problems in the Hadoop tests that are bound to split calculation, since split calculation relies on block sizes.
> Need to:
> 1) clarify whether the treatment of groupBlockSize was intentional;
> 2) fix either the metrics tests or the Hadoop tests.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[ https://issues.apache.org/jira/browse/IGNITE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15744890#comment-15744890 ]

Ivan Veselovsky commented on IGNITE-3877:
-----------------------------------------

The problem is in mixing the notions of group block size and block size. As the comment on class {{org.apache.ignite.igfs.IgfsGroupDataBlocksKeyMapper}} states, when {{org.apache.hadoop.fs.FileSystem}} sits on top of IGFS, the Hadoop FS block size equals the underlying IGFS group block size (which, in turn, is the block size multiplied by the group size). When we have IGFS over a Hadoop FileSystem, we also use the group block size as the block size of the created secondary FS files (see {{org.apache.ignite.internal.processors.igfs.IgfsSecondaryFileSystemCreateContext#create}}). So, when reading files we should follow the same logic: the IGFS *group* block size equals the Hadoop block size, while the IGFS block size is just the configuration value ({{org.apache.ignite.configuration.FileSystemConfiguration#getBlockSize}}). This makes the logic consistent and fixes the assertion issue described above.
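The size relationship the comment above relies on can be sketched as follows. This is an illustrative sketch only; the class name and the numeric values are assumptions for the example, not actual Ignite configuration values or internals:

```java
// Illustrative sketch: how the Hadoop-visible (group) block size relates to
// the IGFS configuration block size and the key mapper's group size.
public class GroupBlockSizeSketch {
    public static void main(String[] args) {
        // Hypothetical configuration value, standing in for what
        // FileSystemConfiguration#getBlockSize would return in a real setup.
        int igfsBlockSize = 65536;

        // Hypothetical group size of IgfsGroupDataBlocksKeyMapper.
        int groupSize = 512;

        // The group block size -- what the comment argues the Hadoop FileStatus
        // should report as its block size -- is the product of the two.
        long groupBlockSize = (long) igfsBlockSize * groupSize;

        System.out.println(groupBlockSize); // prints 33554432
    }
}
```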
[ https://issues.apache.org/jira/browse/IGNITE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485008#comment-15485008 ]

Ivan Veselovsky commented on IGNITE-3877:
-----------------------------------------

It appears that the block sizes in IgfsFile and in IgfsFileInfo are different for the same file (code sample below). This happens because of {{org.apache.ignite.internal.processors.igfs.IgfsImpl#create0}}, where the block size is simply taken from {{cfg.getBlockSize()}} and all the other considerations are ignored. It looks like it is okay to have different block sizes for the primary and underlying file systems (due to the group size feature), but from the primary FS viewpoint the block size should be consistent. So, I guess, {{org.apache.ignite.internal.processors.igfs.IgfsImpl#create0()}} should be fixed: the block size there should be taken in the same way as in {{org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem#convert(org.apache.ignite.igfs.IgfsFile)}}, that is, via {{groupBlockSize()}}.

{code}
igfs.create(file1, 256, true, null, 1, 256, null).close();

IgfsFile f1 = igfs.info(file1);
int blockSize = f1.blockSize();

IgfsEntryInfo info = ((IgfsImpl)igfs).meta.infoForPath(file1);
final int blockSize2 = info.blockSize();

assert blockSize == blockSize2; // this fails
{code}
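The direction proposed in the comment above can be sketched in hedged form. The class and method names below are hypothetical stand-ins for illustration, not the real {{IgfsImpl}} internals, and the numbers are arbitrary: {{create0}} today stores the raw configuration block size, while the proposal is to derive it the same group-scaled way {{IgniteHadoopFileSystem#convert}} reports it:

```java
// Hypothetical illustration of the proposed change; stand-in methods,
// not actual Ignite code.
public class Create0BlockSizeSketch {
    static final int CONFIG_BLOCK_SIZE = 256; // illustrative cfg.getBlockSize() value
    static final int GROUP_SIZE = 128;        // illustrative key-mapper group size

    // Current behavior described in the comment: raw configuration value.
    static int currentBlockSize() {
        return CONFIG_BLOCK_SIZE;
    }

    // Proposed behavior: the same group-scaled value that convert() reports,
    // i.e. the groupBlockSize() route.
    static int proposedBlockSize() {
        return CONFIG_BLOCK_SIZE * GROUP_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(currentBlockSize());  // prints 256
        System.out.println(proposedBlockSize()); // prints 32768
    }
}
```

With both paths returning the group-scaled value, the failing assertion in the code sample above would no longer trip, since IgfsFile and the metadata would agree on one block size.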