Many thanks for digging into the UDF specification and libcdio code to answer the question.
Thomas - you do have commit rights to libcdio, so if you want to fix the
faulty code, by all means do so. Or if you would rather make a patch and
send it to me, or have me double-check it, I'd be happy to do that.

Thanks.

On Fri, Oct 22, 2010 at 3:18 AM, Thomas Schmitt <[email protected]> wrote:

> Hi,
>
> Shaya Potter:
> > > > I'm just wondering if a file in a UDF file system can be fragmented?
> Rocky Bernstein:
> > > the code should be following the ECMA-167 specifications,
> > > I don't see the guarantees, you are looking for,
>
> Actually the contrary of the desired rule is announced.
>
> ECMA-167 4/8.8 says:
> "A file shall be described by a File Entry (see 4/14.9) or by an Extended
>  File Entry (4/14.17), which shall specify the attributes of the file and
>  the location of the file's recorded data. The data of a file shall be
>  recorded in either of the following:
>  - An ordered sequence of extents of logical blocks (see short_ad
>    (4/14.14.1), long_ad (4/14.14.2) and ext_ad (4/14.14.3)). The extents
>    may be recorded or unrecorded, and allocated or unallocated. The
>    extents, if specified as long_ad (4/14.14.2) or ext_ad (4/14.14.3),
>    may be located on different partitions which may be on different
>    volumes."
>
> (This quote illustrates why I still procrastinate the endeavor to
> understand UDF enough to produce UDF images.)
>
> The Allocation Descriptors in ECMA-167 4/14.14 all have 32-bit length
> fields. So I assume that files of 4 GB or more have to be split into
> multiple extents.
> But these extents may well be recorded as consecutive neighbors so that
> they form one single block area.
> It all depends on the UDF producing program.
>
> Another reason for multiple extents might be recording of sparse files
> where large ranges of 0-bytes are represented as unallocated extents.
>
> I understand that files on video DVD may not be larger than 1 GB.
> So the probability is high that they consist of a single extent.
>
>
> Shaya Potter:
> > then does libcdio work?  it would seem from my reading that every
> > udf_read_block() is basically made as an offset to the start of the file.
> >
> > i.e.
> >
> > 1) it calls offset_to_lba to find start sector and length
> > 2) computed max # of blocks
> > 3) calls udf_read_sectors() w/ that information
>
> It seems that it is aware of multiple extents but has problems
> fulfilling read requests which cross an extent boundary.
> Actually it seems to have a bug with extent limit evaluation.
>
> In lib/udf/udf_file.c I read these two snippets.
>
>   static lba_t
>   offset_to_lba(const udf_dirent_t *p_udf_dirent, off_t i_offset,
>                 /*out*/ lba_t *pi_lba, /*out*/ uint32_t *pi_max_size)
>   ...
>         /*
>          * The allocation descriptor field is filled with short_ad's.
>          * If the offset is beyond the current extent, look for the
>          * next extent.
>          */
>         do {
>           ...
>         } while(i_offset >= icblen);
>
>         lsector = (i_offset / UDF_BLOCKSIZE) + p_icb->pos;
>
>         *pi_max_size = p_icb->len;
>
>
> Where offset_to_lba is used, the caller eventually warns and truncates
> the read job to the size of the found extent (which does not make much
> sense).
>
>   lba_t i_lba = offset_to_lba(p_udf_dirent, p_udf->i_position, &i_lba,
>                               &i_max_size);
>   if (i_lba != CDIO_INVALID_LBA) {
>     uint32_t i_max_blocks = CEILING(i_max_size, UDF_BLOCKSIZE);
>     if ( i_max_blocks < count ) {
>       printf("Warning: don't know how to handle yet\n" );
>       count = i_max_blocks;
>     }
>     ret = udf_read_sectors(p_udf, buf, i_lba, count);
>
>
> Reading single sectors should be safe.
>
>
> The code in offset_to_lba() seems faulty, or at least uncoordinated with
> its usage in udf_read_block():
> If *pi_max_size is intended to give the readable bytes in the found
> extent beginning at the current read position, then one should subtract
> i_offset from it.
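A minimal sketch of that correction, written as a standalone toy rather than against the real libcdio structures. The type sketch_extent_t, the function sketch_offset_to_lba() and the constant SKETCH_BLOCKSIZE are hypothetical stand-ins for short_ad, offset_to_lba() and UDF_BLOCKSIZE; the sketch also assumes the offset is reduced by each skipped extent's length, which the elided loop body may or may not do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BLOCKSIZE 2048u  /* assumed block size, stands in for UDF_BLOCKSIZE */

/* Hypothetical stand-in for a short_ad: start block and length in bytes. */
typedef struct {
    uint32_t pos;  /* first logical block of the extent */
    uint32_t len;  /* extent length in bytes */
} sketch_extent_t;

/*
 * Map a byte offset within the file to a logical block number.
 * On success, *max_size receives the number of bytes readable from
 * that offset up to the end of the extent that contains it, i.e. the
 * extent length minus the offset inside the extent.
 * Returns -1 if the offset lies at or beyond the end of the file.
 */
static int64_t
sketch_offset_to_lba(const sketch_extent_t *ext, size_t n_ext,
                     uint64_t offset, uint32_t *max_size)
{
    assert(ext != NULL && max_size != NULL);

    for (size_t i = 0; i < n_ext; i++) {
        if (offset < ext[i].len) {
            /* offset is now relative to the start of extent i */
            *max_size = ext[i].len - (uint32_t) offset;
            return (int64_t) ext[i].pos
                 + (int64_t) (offset / SKETCH_BLOCKSIZE);
        }
        offset -= ext[i].len;  /* skip this extent, try the next one */
    }
    return -1;
}
```

With two extents of 4096 and 2048 bytes, an offset of 2048 would report 2048 readable bytes instead of the full 4096, so the caller no longer overestimates how far it may read.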
>
> To make it fully able to deal with multiple extents:
> In udf_read_block() one would have to repeat the mapping from
> p_udf->i_position to i_lba and i_max_size, and read what is available
> in the next extent, ... until the warning case does not apply any more.
>
>
> > > > just wondering, [...] a DVD
> > > > that I see that has a bunch of 0 length files located in
> > > > what I'd assume to be the location of a longer file.
>
> I am not sure whether ECMA-167 demands that a 0-byte data file has a
> valid start LBA. ECMA-119 (aka ISO 9660) does.
> But if the file has 0 bytes then it might be that the producer simply
> decided to give it the address of some file that has >0 bytes (as usual
> with ECMA-119).
>
> So for now I do not see a connection to the problem of multi-extent files.
>
>
> Have a nice day :)
>
> Thomas
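
For reference, the repeated mapping suggested for udf_read_block() might look roughly like the toy below. It is a sketch under stated assumptions, not libcdio code: demo_file_t, demo_extent_t, demo_map() and demo_read_blocks() are hypothetical names, the medium is a plain byte array, the block size is shrunk to 4 bytes for readability, and every extent except the last is assumed to be a whole number of blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BLOCKSIZE 4u  /* tiny block size, for illustration only */

typedef struct { uint32_t pos; uint32_t len; } demo_extent_t;

typedef struct {
    const uint8_t       *medium;   /* fake medium, indexed by block */
    const demo_extent_t *ext;      /* ordered extents of the file */
    size_t               n_ext;
    uint64_t             position; /* current byte offset in the file */
} demo_file_t;

/* Map the current position to a block number; *max_size receives the
 * bytes left in the containing extent. Returns -1 at end of file. */
static int64_t demo_map(const demo_file_t *f, uint32_t *max_size)
{
    uint64_t offset = f->position;
    for (size_t i = 0; i < f->n_ext; i++) {
        if (offset < f->ext[i].len) {
            *max_size = f->ext[i].len - (uint32_t) offset;
            return (int64_t) f->ext[i].pos
                 + (int64_t) (offset / DEMO_BLOCKSIZE);
        }
        offset -= f->ext[i].len;
    }
    return -1;
}

/* Read up to count blocks into buf, re-mapping the position whenever
 * an extent is exhausted. Returns the number of blocks actually read. */
static size_t demo_read_blocks(demo_file_t *f, uint8_t *buf, size_t count)
{
    size_t done = 0;
    assert(f != NULL && buf != NULL);

    while (done < count) {
        uint32_t max_size;
        int64_t lba = demo_map(f, &max_size);
        if (lba < 0)
            break;  /* end of file reached */

        /* blocks available in this extent, rounded up */
        size_t max_blocks = (max_size + DEMO_BLOCKSIZE - 1) / DEMO_BLOCKSIZE;
        size_t n = (count - done < max_blocks) ? count - done : max_blocks;

        memcpy(buf + done * DEMO_BLOCKSIZE,
               f->medium + (size_t) lba * DEMO_BLOCKSIZE,
               n * DEMO_BLOCKSIZE);
        f->position += (uint64_t) n * DEMO_BLOCKSIZE;
        done += n;
    }
    return done;
}
```

The point of the design is that a request crossing an extent boundary is split into one read per extent instead of being truncated with a warning; in real code the memcpy() would be the call that reads sectors from the device.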
