[jira] [Commented] (IMAGING-285) Decoding of Rational Numbers broken when large values present

2021-08-09 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17396275#comment-17396275
 ] 

Gary Lucas commented on IMAGING-285:


Thank you for identifying this and supplying resources that allow it to be tested.

I have a fix in place for the RationalNumber class that seems to work.  Your 
diagnosis that the error is introduced in the ByteConversions class was correct. 
I've got a bit more polishing to do on the code before I submit it, and I need 
to go over the case where the SRATIONAL (signed rational) format is used.

Quick question:  Do you have additional resources for testing, or perhaps a 
dump of the elements in the TIFF file?  This is not a feature that I use, so I 
would like to get a bit more extensive tests before I call it "done".

 

Every so often, I have to remind myself that TIFF is a very old format and, in 
fact, dates from before the IEEE-754 standard was universally adopted.  It's 
the only reason I can think of that the Rational Number format would even have 
come into existence.   

> Decoding of Rational Numbers broken when large values present
> -------------------------------------------------------------
>
>                 Key: IMAGING-285
>                 URL: https://issues.apache.org/jira/browse/IMAGING-285
>             Project: Commons Imaging
>          Issue Type: Bug
>          Components: imaging.common.*
>    Affects Versions: 1.0-alpha2
>            Reporter: John Andrade
>            Priority: Major
>         Attachments: DJI_0267 cropped.JPG
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Decoding Lat/Long EXIF data from JPEGs does not work for some values.  I have 
> attached a file where the conversion fails.  I used the 
> getLatitudeAsDegreesNorth and getLongitudeAsDegreesEast methods from the 
> TiffImageMetaData.GPSInfo class.  The values are close, but seemingly off by 
> a few miles.
> I've traced the source and I believe the issue is with how the RationalNumber 
> class uses two 32-bit signed integers for the numerator and denominator.  The 
> definition of a RATIONAL data type uses 32-bit unsigned numbers.  It seems as 
> if the RationalNumber class already expects this as it has a non-public 
> static method called factoryMethod to create a RationalNumber from two 64-bit 
> numbers.
> This error is introduced in the ByteConversions class, specifically the 
> toRational method where it uses the regular RationalNumber constructor and 
> thus ensures any rational numbers that have numerator or denominator greater 
> than the max signed 32-bit value will produce invalid values.
> I am attaching a JPEG that has this problem.  I had to crop it to reduce the 
> size, but the EXIF data was preserved.
> The EXIF GPS data contained in the JPEG is:
> GpsLatitudeRef: "N"
> GpsLatitude: 38, 1, 36, 1, 4120083230, 70000000
> GpsLongitudeRef: "W"
> GpsLongitude: 90, 1, 12, 1, 2379156648, 70000000
> According to the properties of the image (right-clicking on Windows 10), the 
> L/L of the image is:
> Latitude: 38: 36: 58.85833
> Longitude: 90: 12: 33.98795... (Windows doesn't show E/W)
> These values convert to:
> 38.616349536627
> -90.2094410978095
> When I use the getLatitudeAsDegreesNorth  and getLongitudeAsDegreesEast 
> methods, I get the following values:
> 38.5993060156
> -90.19239757679365
>  
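The discrepancy described in the report can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not code from Commons Imaging: the class and method names are hypothetical, and it assumes a seconds denominator of 70000000, which is the value consistent with the 58.85833 seconds that Windows displays. It decodes the same 32-bit pattern once as unsigned and once as signed, then converts degrees, minutes, and seconds to decimal degrees.

```java
// Sketch only: names are hypothetical and not part of the Commons Imaging API.
final class GpsRationalSketch {

    /** Interprets both 32-bit patterns as unsigned, as the RATIONAL type requires. */
    static double unsignedRatio(int numeratorBits, int denominatorBits) {
        long n = Integer.toUnsignedLong(numeratorBits);
        long d = Integer.toUnsignedLong(denominatorBits);
        return (double) n / (double) d;
    }

    /** Interprets the same bits as signed ints, which reproduces the reported error. */
    static double signedRatio(int numeratorBits, int denominatorBits) {
        return (double) numeratorBits / (double) denominatorBits;
    }

    /** Converts degrees, minutes, and seconds to decimal degrees. */
    static double toDecimalDegrees(double degrees, double minutes, double seconds) {
        return degrees + minutes / 60.0 + seconds / 3600.0;
    }

    public static void main(String[] args) {
        // 4120083230 does not fit in a signed int; the narrowing cast keeps the
        // same 32 bits, which read as -174884066 when treated as signed.
        int secondsNumerator = (int) 4120083230L;
        int secondsDenominator = 70000000; // assumed value, see the note above

        double unsignedLat = toDecimalDegrees(38, 36,
                unsignedRatio(secondsNumerator, secondsDenominator));
        double signedLat = toDecimalDegrees(38, 36,
                signedRatio(secondsNumerator, secondsDenominator));

        System.out.println("unsigned decode: " + unsignedLat);
        System.out.println("signed decode:   " + signedLat);
    }
}
```

Under these assumptions the unsigned decode lands on the expected latitude of about 38.61635, while the signed decode gives about 38.59931, matching the wrong value quoted in the report.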



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IMAGING-285) Decoding of Rational Numbers broken when large values present

2021-08-09 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17396327#comment-17396327
 ] 

Gary Lucas commented on IMAGING-285:


No.  Thanks for bringing it to my attention.

I was vaguely aware that there was some kind of calving process going on in 
Commons with regard to all things mathematical, but I didn't realize that 
Commons Numbers existed.  

The RationalNumber handling in the TIFF/EXIF formats is pretty narrow and 
specialized, so it won't benefit from Commons Numbers.  But on quick inspection 
there appear to be a lot of interesting things going on there: quaternions, 
gamma functions.  Cool stuff.

 



[jira] [Commented] (IMAGING-285) Decoding of Rational Numbers broken when large values present

2021-08-10 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17396661#comment-17396661
 ] 

Gary Lucas commented on IMAGING-285:


> Is there some particular feature that would prevent that the latter [Commons 
> Numbers] be a replacement for the former [existing code]?

It's not that at all.  I looked at Commons Numbers, and there's a lot to like 
about it.  I really hope you see widespread adoption.  In the case of Commons 
Imaging, I think that the issue is one of project dependencies. 

I caveat this by saying that I am not in charge of Commons Imaging, so I 
don't speak for the project.  But I am reluctant to add any new dependencies to 
what should be just a single-component API for developers to drop into their 
applications.  Also, Imaging's RationalNumber class is small enough that there 
isn't a strong motivation to depend on an external project, even though that 
project would almost certainly be more carefully written and maintained than 
the single, specialized class in Commons Imaging.

Interestingly, the TIFF standard does not specify the arithmetic to be used for 
rational numbers. It calls for specifying real numbers using two unsigned 
32-bit integers. Let's call them a and b.  So the meaning of the computed 
floating-point value (double)a/(double)b is pretty unambiguous.  But the 
Commons Imaging RationalNumber class also has a method that returns the integer 
value a/b, which truncates, and the TIFF standard doesn't say anything about 
round-off.  Perhaps (a+b/2)/b, which rounds to the nearest integer, would have 
been a better solution?  I personally think so, but I am not about to mess with 
the way the code has always worked. At the same time, I have to say that one 
clear advantage of Commons Numbers is that it would "fill in the blanks" where 
there are gaps in the specification. Since Numbers specializes in things that 
Imaging merely uses, I'm sure you guys have worked through the details on 
operations like that. 
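To make the round-off question concrete, here is a small sketch comparing the truncating quotient a/b with the rounded form (a+b/2)/b. The class and method names are hypothetical and are not taken from RationalNumber. It assumes a is non-negative and b is positive, as they would be for unsigned 32-bit inputs, and uses long so that a + b/2 cannot overflow.

```java
// Sketch only: hypothetical helper names, not the RationalNumber API.
final class RationalRoundingSketch {

    /** Plain integer division; discards the fractional part. */
    static long truncated(long a, long b) {
        return a / b;
    }

    /** Rounds to the nearest integer (halves round up); assumes a >= 0 and b > 0. */
    static long rounded(long a, long b) {
        return (a + b / 2) / b;
    }

    public static void main(String[] args) {
        long[][] samples = { {7, 2}, {9, 4}, {11, 4}, {4294967295L, 2} };
        for (long[] s : samples) {
            System.out.println(s[0] + "/" + s[1]
                    + " truncated=" + truncated(s[0], s[1])
                    + " rounded=" + rounded(s[0], s[1]));
        }
    }
}
```

For 7/2 the truncating form gives 3 and the rounded form gives 4; for 9/4 both give 2, since 2.25 is closer to 2 than to 3.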

 



[jira] [Commented] (IMAGING-285) Decoding of Rational Numbers broken when large values present

2021-08-10 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17396933#comment-17396933
 ] 

Gary Lucas commented on IMAGING-285:


I think that the suggestion of using Commons Numbers really belongs in a 
separate Jira item as an "enhancement".  For now I'm going to limit my changes 
to the issue that I understand and save any undertakings of a broader scope for 
future work.

As an addendum, it turns out that there are some clear pros and cons for 
Gilles Sadowski's position. I will start with the con.

It turns out that the Commons Numbers API shares the same problem as the 
Commons Imaging code.  The problem is that the TIFF specification for Rational 
elements calls for rational numbers based on two _*unsigned*_ 32-bit integers.  
But Imaging's current RationalNumber class and Numbers' Fraction class both 
take their inputs as signed integers.  So computing the seconds part of the 
latitude using the parameters from the sample EXIF data in the original post, 
we find:

 

 
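{code:java}
 // Fraction is org.apache.commons.numbers.fraction.Fraction
 int numerator   = -174884066; // 0xf5937b1e, i.e. 4120083230 when read as unsigned
 int denominator = 70000000;
 Fraction frac = Fraction.of(numerator, denominator);
 System.out.println("frac = " + frac.doubleValue());
 // The "unsigned" approach: mask off the sign extension when widening to long
 long n = numerator   & 0xFFFFFFFFL;
 long d = denominator & 0xFFFFFFFFL;
 double lat = (double) n / (double) d; // seconds part of the latitude
 System.out.println("latitude = " + lat);
 // Results:
 //     frac     = -2.4983438
 //     latitude = 58.85833185714286
{code}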
 

 

Thus we see that Numbers has its own variation of the problem.  Numbers does 
have a class called BigFraction which could be used to deal with this. But to me 
it seems a little bit overcomplicated for this requirement.

On the pro side of Mr. Sadowski's suggestion, it looks like the code in Numbers 
has received a lot more focused development than the code in Imaging's 
RationalNumber class. For example, looking at the code for Imaging's 
RationalNumber class, I spotted a shortcoming:
{code:java}
 @Override
 public float floatValue() {
     return (float) numerator / (float) divisor;
 }
{code}
The shortcoming here is that an IEEE-754 standard 32-bit floating-point value 
has only 24 bits of precision in its mantissa while an unsigned integer has 32. 
So some of the low-order bits in the numerator and denominator could be 
thrown away by the cast even before the division operation is performed.  A 
better solution would be to do things the way Commons Numbers does, which would 
be basically the following:

 
{code:java}
 @Override
 public float floatValue() {
     return (float) doubleValue();
 }

 @Override
 public double doubleValue() {
     return (double) numerator / (double) divisor;
 }
{code}
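The precision loss is easy to demonstrate with a numerator that needs 25 bits. The sketch below is illustrative only: the class and method names are hypothetical stand-ins for the two floatValue() variants, and the inputs (2^24 + 1 over 5) were picked so that the early cast to float changes the numerator before the division happens.

```java
// Sketch only: hypothetical names standing in for the two floatValue() variants.
final class FloatValueSketch {

    /** Casts each operand to float first; low-order bits may be lost before dividing. */
    static float viaFloatCasts(long numerator, long divisor) {
        return (float) numerator / (float) divisor;
    }

    /** Divides in double precision, then rounds once to float. */
    static float viaDouble(long numerator, long divisor) {
        return (float) ((double) numerator / (double) divisor);
    }

    public static void main(String[] args) {
        long numerator = 16_777_217L; // 2^24 + 1, not exactly representable as a float
        long divisor = 5L;

        // Widen to double for printing so the exact float values are shown.
        System.out.println((double) viaFloatCasts(numerator, divisor)); // 3355443.25
        System.out.println((double) viaDouble(numerator, divisor));     // 3355443.5
    }
}
```

The exact quotient is 3355443.4, so the double-based version returns the nearer of the two representable float values.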
 

 

 


[jira] [Commented] (IMAGING-285) Decoding of Rational Numbers broken when large values present

2021-08-11 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397715#comment-17397715
 ] 

Gary Lucas commented on IMAGING-285:


I've got some fixes to the code base to correct the GPS Info behaviors reported 
by this issue.  They involve changes to the RationalNumber class, but also 
some changes that have bubbled upward into some of the modules that call it, 
including ByteConversions and various Tag-related classes.

I should be posting the new logic as soon as I put together some appropriate 
JUnit tests.



[jira] [Created] (IMAGING-311) Read TIFFs with multiple floating-point samples

2021-08-26 Thread Gary Lucas (Jira)
Gary Lucas created IMAGING-311:
--

 Summary: Read TIFFs with multiple floating-point samples
 Key: IMAGING-311
 URL: https://issues.apache.org/jira/browse/IMAGING-311
 Project: Commons Imaging
  Issue Type: New Feature
  Components: Format: TIFF
Affects Versions: 1.0-alpha3
 Environment: 
[IMAGING-251|https://issues.apache.org/jira/browse/IMAGING-251]
Reporter: Gary Lucas


I propose to extend Commons Imaging to support reading TIFF files that contain 
floating-point formats that feature more than one sample.  The ability to 
support floating-point samples was introduced in 





[jira] [Updated] (IMAGING-311) Read TIFFs with multiple floating-point samples

2021-08-26 Thread Gary Lucas (Jira)


 [ 
https://issues.apache.org/jira/browse/IMAGING-311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated IMAGING-311:
---
Description: 
I propose to extend Commons Imaging to support reading TIFF files that contain 
floating-point formats that feature more than one sample.  The ability to 
support floating-point samples was introduced in  
[IMAGING-251|https://issues.apache.org/jira/browse/IMAGING-251].  But this 
implementation was limited to only those files that provided a single sample 
per raster cell (a single sample per pixel).

If anyone knows of good sources for test TIFF files that use this format, 
please let me know.

*Background*

In addition to conventional image data, TIFF files can provide floating-point 
numerical information.  This feature is often used for geophysical data (in 
GeoTIFF files), but can also be applied to other uses.  Although the existing 
implementation can support files which give a single value per raster cell, 
there are some products that carry multiple samples per cell. Examples include 
products that give both a measured value and a corresponding accuracy estimate 
(e.g. 245.6 meters plus or minus 0.5 meters). Some geophysical products give 
vectors (gravitational potential, wind vectors, ocean currents, etc.).

Changes would involve extensions to the classes in the TIFF datareader package 
as well as the TiffRasterData class.
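As a rough illustration of what "multiple samples per cell" means for a data holder, the sketch below stores pixel-interleaved float samples and lets a caller read one sample or extract one sample plane. It is purely hypothetical: it is not the TiffRasterData API, the class and method names are invented for this example, and it assumes interleaved storage even though TIFF also permits planar storage.

```java
// Hypothetical sketch; this is NOT the Commons Imaging TiffRasterData API.
final class MultiSampleRasterSketch {
    private final int width;
    private final int height;
    private final int samplesPerPixel;
    // Assumed layout: all samples for pixel (0,0), then pixel (1,0), and so on.
    private final float[] data;

    MultiSampleRasterSketch(int width, int height, int samplesPerPixel, float[] data) {
        if (width <= 0 || height <= 0 || samplesPerPixel <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        if (data.length != width * height * samplesPerPixel) {
            throw new IllegalArgumentException("data length does not match dimensions");
        }
        this.width = width;
        this.height = height;
        this.samplesPerPixel = samplesPerPixel;
        this.data = data.clone();
    }

    /** Returns one sample of one raster cell. */
    float getValue(int x, int y, int sample) {
        if (x < 0 || x >= width || y < 0 || y >= height
                || sample < 0 || sample >= samplesPerPixel) {
            throw new IndexOutOfBoundsException("x=" + x + ", y=" + y + ", sample=" + sample);
        }
        return data[(y * width + x) * samplesPerPixel + sample];
    }

    /** Copies a single sample index out of every cell, in row-major order. */
    float[] getPlane(int sample) {
        if (sample < 0 || sample >= samplesPerPixel) {
            throw new IndexOutOfBoundsException("sample=" + sample);
        }
        float[] plane = new float[width * height];
        for (int i = 0; i < plane.length; i++) {
            plane[i] = data[i * samplesPerPixel + sample];
        }
        return plane;
    }

    public static void main(String[] args) {
        // Two cells, each holding a measured value and an accuracy estimate.
        float[] samples = {245.6f, 0.5f, 100.0f, 1.0f};
        MultiSampleRasterSketch raster = new MultiSampleRasterSketch(2, 1, 2, samples);
        System.out.println("value    = " + raster.getValue(0, 0, 0));
        System.out.println("accuracy = " + raster.getValue(0, 0, 1));
    }
}
```

Whether a real implementation would keep samples interleaved or split them into planes is a design choice this sketch does not try to settle.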

  was:I propose to extend Commons Imaging to support reading TIFF files that 
contain floating-point formats that feature more than one sample.  The ability 
to support floating-point samples was introduced in 


> Read TIFFs with multiple floating-point samples
> -----------------------------------------------
>
>                 Key: IMAGING-311
>                 URL: https://issues.apache.org/jira/browse/IMAGING-311
>             Project: Commons Imaging
>          Issue Type: New Feature
>          Components: Format: TIFF
>    Affects Versions: 1.0-alpha3
>         Environment: [IMAGING-251|https://issues.apache.org/jira/browse/IMAGING-251]
>            Reporter: Gary Lucas
>            Priority: Major
>





[jira] [Updated] (IMAGING-311) Read TIFFs with multiple floating-point samples

2021-08-26 Thread Gary Lucas (Jira)


 [ 
https://issues.apache.org/jira/browse/IMAGING-311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated IMAGING-311:
---
Description: 
I propose to extend Commons Imaging to support reading TIFF files that contain 
floating-point formats that feature more than one sample.  The ability to 
support floating-point samples was introduced in  
[IMAGING-251|https://issues.apache.org/jira/browse/IMAGING-251].  But this 
implementation was limited to only those files that provided a single sample 
per raster cell (i.e. "a single sample per pixel"). The ability to read files 
with multiple samples per raster cell would extend the usefulness of the 
Commons Imaging API, particularly for geophysical applications.

If anyone knows of good sources for test TIFF files that use this format, 
please let me know.

*Background*

In addition to conventional image data, TIFF files can provide floating-point 
numerical information.  This feature is often used for geophysical data (in 
GeoTIFF files), but can also be applied to other uses.  Although the existing 
implementation can support files which give a single value per raster cell, 
there are some products that carry multiple samples per cell. Examples include 
products that give both a measured value and a corresponding accuracy estimate 
(e.g. 245.6 meters plus or minus 0.5 meters). Some geophysical products give 
vectors (gravitational potential, wind vectors, ocean currents, etc.).

Changes would involve extensions to the classes in the TIFF datareader package 
as well as the TiffRasterData class.



[jira] [Updated] (IMAGING-311) Read TIFFs with multiple floating-point samples

2021-08-27 Thread Gary Lucas (Jira)


 [ 
https://issues.apache.org/jira/browse/IMAGING-311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated IMAGING-311:
---
Description: 
I propose to extend Commons Imaging to support reading TIFF files that contain 
floating-point formats that feature more than one sample.  The ability to 
support floating-point samples was introduced in  
[IMAGING-251|https://issues.apache.org/jira/browse/IMAGING-251].  But that 
implementation was limited to only those files that provided a single sample 
per raster cell (i.e. "a single sample per pixel"). The ability to read files 
with multiple samples per raster cell would extend the usefulness of the 
Commons Imaging API, particularly for geophysical applications.

If anyone knows of good sources for test TIFF files that use this format, 
please let me know.

*Background*

In addition to conventional image data, TIFF files can provide floating-point 
numerical information.  This feature is often used for geophysical data (in 
GeoTIFF files), but can also be applied to other uses.  Although the existing 
implementation can support files which give a single value per raster cell, 
there are some products that carry multiple samples per cell. Examples include 
products that give both a measured value and a corresponding accuracy estimate 
(e.g. 245.6 meters plus or minus 0.5 meters). Some geophysical products give 
vectors (gravitational potential, wind vectors, ocean currents, etc.).

Changes would involve extensions to the classes in the TIFF datareader package 
as well as the TiffRasterData class.


> Read TIFFs with multiple floating-point samples
> ---
>
> Key: IMAGING-311
> URL: https://issues.apache.org/jira/browse/IMAGING-311
> Project: Commons Imaging
>  Issue Type: New Feature
>  Components: Format: TIFF
>Affects Versions: 1.0-alpha3
> Environment: 
> [IMAGING-251|https://issues.apache.org/jira/browse/IMAGING-251]
>Reporter: Gary Lucas
>Priority: Major
>
> I propose to extend Commons Imaging to support reading TIFF files that 
> contain floating-point formats that feature more than one sample.  The 
> ability to support floating-point samples was introduced in  
> [ISSUE-251|https://issues.apache.org/jira/browse/IMAGING-251].  But that 
> implementation was limited to only those files that provided a single sample 
> per raster cell (e.g. "a single sample per pixel"). The ability to read files 
> with multiple samples per raster cell would extend the usefulness of the 
> Commons Imaging API, particularly for geophysical applications.
> If anyone knows of good sources for test TIFF files that use this format, 
> please let me know.
> *Background*
> In addition to conventional image data, TIFF files can provide floating-point 
> numerical information.  This feature is often used for geophysical data (in 
> GeoTIFF files), but can also be applied to other uses.  Although the existing 
> implementation can support files which give a single value per raster cell, 
> there are some products that carry multiple samples per cell. Examples 
> include products that give both a measured value and a corresponding accuracy 
> estimate (e.g. 245.6 meters plus or minus 0.5 meters). Some geophysical 
> products give vectors (gravitational potential, wind vectors, ocean currents, 
> etc.).
> Changes would involve extensions to the classes in the TIFF datareaders 

[jira] [Created] (IMAGING-312) Alpha-channel setting not interpreted from ExtraSamples tag

2021-09-03 Thread Gary Lucas (Jira)
Gary Lucas created IMAGING-312:
--

 Summary: Alpha-channel setting not interpreted from ExtraSamples 
tag
 Key: IMAGING-312
 URL: https://issues.apache.org/jira/browse/IMAGING-312
 Project: Commons Imaging
  Issue Type: Bug
  Components: Format: TIFF
Affects Versions: 1.0-alpha2
 Environment:  
Reporter: Gary Lucas


Commons Imaging sometimes misinterprets TIFF files that store RGB pixels in 4 
bytes but do not define alpha.   In some cases, these images are treated as 
semi-transparent when they should be opaque.   Commons Imaging is not unique in 
this regard...  Windows Photo Viewer does the same thing.

The TIFF specification allows RGB images to be encoded with 4 bytes per pixel.  
It would be natural to assume (as Commons Imaging does) that the 4th byte is 
the alpha channel and that it would have a value of 0xff wherever a pixel is 
opaque. However, the interpretation of the 4th byte depends on 
information in the TIFF "ExtraSamples" tag. 

It turns out that there are images in the wild that use 4 bytes, but populate 
the 4th byte with junk values. For example, there are a number of older aerial 
photographs from the US Geological Survey (USGS) that do this.  These images 
give an ExtraSamples tag with a value of zero.  But the TIFF specification 
calls for images to be treated as having alpha channels only if the 
ExtraSamples field carries a value of either 1 or 2.   When ExtraSamples has a 
value of 0, the 4th byte is to be ignored. 

There are many examples of this phenomenon on the USGS Earth Explorer website. 
One specific example: 

* High Resolution Orthoimagery
* Dataset: 201203_connecticut_state_lot1_ct_0x3000m_utm_cnir
* Entity: 2818289_18TYL425825
* File: 18tyl425825.tif 




*Proposed Fix*
I propose to do the following:
* Extend the TiffImageParser logic for detecting alpha to assume hasAlpha is 
true if and only if the ExtraSamples tag is supplied and contains a value of 1 
or 2. 
* Provide a hasAlpha accessor for the ImageBuilder class (it should really have 
one anyway)
* Enhance the DataReaderStrips and DataReaderTiles classes to check hasAlpha 
when processing RGB images that have 4 samples per pixel. 
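
As a rough sketch of the detection rule in the first bullet (not the actual TiffImageParser code), the check could look like the following. The class and method names are hypothetical. The constants follow the TIFF specification, where an ExtraSamples value of 0 means unspecified data, 1 means associated alpha, and 2 means unassociated alpha. Treating a missing tag as "no alpha" is an assumption made here for the sake of the example.

```java
/** Hypothetical helper illustrating the proposed hasAlpha rule. */
final class ExtraSamplesAlpha {
    static final int UNSPECIFIED = 0;
    static final int ASSOCIATED_ALPHA = 1;
    static final int UNASSOCIATED_ALPHA = 2;

    /**
     * @param samplesPerPixel value of the SamplesPerPixel tag
     * @param extraSamples    values of the ExtraSamples tag, or null when
     *                        the tag is absent from the directory
     * @return true only when a 4-sample image explicitly declares alpha
     */
    static boolean hasAlpha(int samplesPerPixel, int[] extraSamples) {
        if (samplesPerPixel != 4) {
            return false;
        }
        if (extraSamples == null || extraSamples.length == 0) {
            // Tag not supplied: assumed opaque in this sketch.
            return false;
        }
        int type = extraSamples[0];
        return type == ASSOCIATED_ALPHA || type == UNASSOCIATED_ALPHA;
    }
}
```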


*Concerns*
I am not sure what to do if an RGB TIFF image uses 4 samples per 
pixel but the ExtraSamples tag is not provided.  So far, I have not seen 
an example of this, but my collection of sample TIFF files is rather narrow and 
I would not rule it out.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IMAGING-312) Alpha-channel setting not interpreted from ExtraSamples tag

2021-09-03 Thread Gary Lucas (Jira)


 [ 
https://issues.apache.org/jira/browse/IMAGING-312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated IMAGING-312:
---
Description: 
Commons Imaging sometimes misinterprets TIFF files that store RGB pixels in 4 
bytes but do not define alpha.   In some cases, these images are treated as 
semi-transparent when they should be opaque.   Commons Imaging is not unique in 
this regard...  Windows Photo Viewer does the same thing.

The TIFF specification allows RGB images to be encoded with 4 bytes per pixel.  
It would be natural to assume (as Commons Imaging does) that the 4th byte is 
the alpha channel and that it would have a value of 0xff wherever a pixel is 
opaque. However, the interpretation of the 4th byte depends on 
information in the TIFF "ExtraSamples" tag. 

It turns out that there are images in the wild that use 4 bytes, but populate 
the 4th byte with junk values. For example, there are a number of older aerial 
photographs from the US Geological Survey (USGS) that do this.  These images 
give an ExtraSamples tag with a value of zero.  But the TIFF specification 
calls for images to be treated as having alpha channels only if the 
ExtraSamples field carries a value of either 1 or 2.   When ExtraSamples has a 
value of 0, the 4th byte is to be ignored. 

So while the USGS TIFF files are in compliance with the TIFF specification, 
their behavior is unintuitive.  Because the Commons Imaging library assumes 
that the 4th byte carries valid alpha values, it does not 
render the images correctly.
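
To make the rendering effect concrete, here is a small sketch of how a 4-byte pixel might be packed into a Java ARGB integer. The class and method names are hypothetical and this is not the library's code. It only shows that a junk 4th byte produces a semi-transparent pixel unless it is replaced with 0xff.

```java
/** Hypothetical helper showing the effect of the 4th byte on ARGB packing. */
final class ArgbPacking {
    /**
     * Packs one pixel into the 0xAARRGGBB layout used by BufferedImage.
     * When hasAlpha is false the 4th byte is ignored and the pixel is
     * made fully opaque.
     */
    static int toArgb(int r, int g, int b, int fourthByte, boolean hasAlpha) {
        int alpha = hasAlpha ? (fourthByte & 0xff) : 0xff;
        return (alpha << 24)
            | ((r & 0xff) << 16)
            | ((g & 0xff) << 8)
            | (b & 0xff);
    }
}
```

With a junk 4th byte of 0x7f, treating it as alpha yields a pixel that is roughly half transparent, while ignoring it yields the opaque color that the file intended.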

There are many examples of this phenomenon on the USGS Earth Explorer website. 
One specific example: 

* High Resolution Orthoimagery
* Dataset: 201203_connecticut_state_lot1_ct_0x3000m_utm_cnir
* Entity: 2818289_18TYL425825
* File: 18tyl425825.tif 




*Proposed Fix*
I propose to do the following:
* Extend the TiffImageParser logic for detecting alpha to assume hasAlpha is 
true if and only if the ExtraSamples tag is supplied and contains a value of 1 
or 2. 
* Provide a hasAlpha accessor for the ImageBuilder class (it should really have 
one anyway)
* Enhance the DataReaderStrips and DataReaderTiles classes to check hasAlpha 
when processing RGB images that have 4 samples per pixel. 


*Concerns*
I am not sure what to do if an RGB TIFF image uses 4 samples per 
pixel but the ExtraSamples tag is not provided.  So far, I have not seen 
an example of this, but my collection of sample TIFF files is rather narrow and 
I would not rule it out.



> Alpha-channel setting not interpreted from ExtraSamples tag
> ---
>
> Key: IMAGING-312
> URL: https://issues.apache.org/jira/browse/IMAGING-312
>

[jira] [Updated] (IMAGING-266) Read integer data from GeoTIFFS

2021-09-12 Thread Gary Lucas (Jira)


 [ 
https://issues.apache.org/jira/browse/IMAGING-266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated IMAGING-266:
---
Attachment: Imaging266_SRTM.jpg

> Read integer data  from GeoTIFFS
> 
>
> Key: IMAGING-266
> URL: https://issues.apache.org/jira/browse/IMAGING-266
> Project: Commons Imaging
>  Issue Type: New Feature
>  Components: Format: TIFF
>Affects Versions: 1.0-alpha3
>Reporter: Gary Lucas
>Assignee: Bruno P. Kinoshita
>Priority: Major
> Fix For: 1.0-alpha3
>
> Attachments: Imaging266_SRTM.jpg
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> I recently discovered that there is a large amount of digital elevation data 
> available in the form of 16-bit integer coded data in GeoTIFF files (TIFF 
> files with geographic tags).  I propose to enhance the Commons Imaging API to 
> read these files.  This work will be similar to the work I did for reading 
> floating-point raster data under IMAGING-251.
> Available data include the nearly-global coverage of one-second of arc 
> elevation data produced from the Shuttle Radar Topography Mission (SRTM) and 
> other sources. These products give grids of elevation data with a 30 meter 
> cell spacing for most of the world's land masses. They are available at NASA 
> Earthdata and Japan Space Systems websites, see 
> [https://asterweb.jpl.nasa.gov/gdem.asp|https://asterweb.jpl.nasa.gov/gdem.asp]
>  There is also an ocean bathymetry data set available in this format at 
> [http://www.shadedrelief.com/blue-earth/]
> This new feature will continue to expand the usefulness of the Commons 
> Imaging API in accessing GeoTIFF products.
> Request for Feedback
> So far, the data products I've found (ASTER and Blue Earth Bathymetry) give 
> elevation and ocean depth data in meters recorded as a short integer.  I 
> haven't found an example of where the 32-bit integer format is used.  For 
> now, I am planning on only coding the 16-bit integer variation.  Does anyone 
> know if the 32-bit version is worth supporting?  My criteria for determining 
> that would be based on whether there is a significant number of projects 
> using that format (life is too short to chase rarely used data formats).
> Currently, one of the code-analysis operations conducted by the Commons 
> Imaging build process is coverage by JUnit tests.  Lacking any test data for 
> the 32-bit case, I am reluctant to include it in the code base because it 
> would mean putting uncovered code into the distribution.
> Also, I am wondering about the best design for the access API.  The current 
> TiffImageParser class has a method called getFloatingPointRasterData() that 
> returns an instance of TiffRasterData.  TiffRasterData is pretty much 
> hard-wired to floating point data.  I am thinking of creating a new method 
> called getIntegerRasterData() that would return an instance of a new class 
> called TiffIntegerRasterData. Does this seem reasonable?  I considered trying 
> to combine both kinds of results into a unified class and method, but it 
> seems more unwieldy than useful. 
>  
>  





[jira] [Commented] (IMAGING-266) Read integer data from GeoTIFFS

2021-09-12 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17413674#comment-17413674
 ] 

Gary Lucas commented on IMAGING-266:


Well, it took a couple of tries, but I have just submitted changes to address 
this issue. This new feature allows developers to use the Commons Imaging API 
to access the extensive set of high-resolution, global elevation data available 
through the Shuttle Radar Topography Mission (SRTM) as well as many other 
geophysical data sources.

Here's an example image produced from a TIFF file that contains elevation data 
for a 1-degree square in the vicinity of Auckland, New Zealand. The original 
TIFF file gives a 3601-by-3601 grid of elevation data points at a 1 second of 
arc spacing (about 30 meters). The image shown below is reduced to 25 percent 
of the original size.
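
The grid size and cell spacing quoted above can be checked with a little arithmetic. The sketch below is a rough approximation that treats the Earth as a sphere with the WGS84 equatorial radius. The class name is hypothetical, and the latitude of about 37 degrees south for Auckland is an approximate value supplied for the example.

```java
/** Hypothetical helper for arc-second grid arithmetic (spherical approximation). */
final class ArcSecondGrid {
    static final double EARTH_RADIUS_M = 6378137.0; // WGS84 equatorial radius

    /** Grid points along one degree, counting both edges of the tile. */
    static int pointsPerDegree(double spacingArcSec) {
        return (int) Math.round(3600.0 / spacingArcSec) + 1;
    }

    /** Approximate north-south distance covered by the given angular spacing. */
    static double northSouthSpacingMeters(double spacingArcSec) {
        return Math.toRadians(spacingArcSec / 3600.0) * EARTH_RADIUS_M;
    }

    /** Approximate east-west distance, which shrinks with the cosine of latitude. */
    static double eastWestSpacingMeters(double spacingArcSec, double latitudeDeg) {
        return northSouthSpacingMeters(spacingArcSec)
            * Math.cos(Math.toRadians(latitudeDeg));
    }
}
```

With a 1 arc-second spacing this gives 3601 points per degree and a north-south spacing of roughly 31 meters. At Auckland's latitude the east-west spacing shrinks to roughly 25 meters.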

!Imaging266_SRTM.jpg!

 

The image above was produced using a modified version of my DemoCOG 
application.  I've written a web article describing the techniques used for 
rendering numerical data from GeoTIFFs.  If you are interested, you can read it 
at [https://gwlucastrig.github.io/gridfour/notes/ElevationGeoTiff1.html]

An earlier version of the Imaging API implemented a method called 
getFloatingPointRasterData() to read data from TIFF files that contained 
floating-point data. However, the elevation data in SRTM products is stored in 
a short-integer format. So the access method has been changed to handle both 
floating-point and integral data types and its name has been changed to 
getRasterData().

The example code below shows how to use the API:
{code:java}
import java.io.File;
import org.apache.commons.imaging.FormatCompliance;
import org.apache.commons.imaging.common.bytesource.ByteSourceFile;
import org.apache.commons.imaging.formats.tiff.TiffContents;
import org.apache.commons.imaging.formats.tiff.TiffDirectory;
import org.apache.commons.imaging.formats.tiff.TiffRasterData;
import org.apache.commons.imaging.formats.tiff.TiffRasterDataType;
import org.apache.commons.imaging.formats.tiff.TiffReader;

public class ExampleRasterReader {
    public static void main(String[] args) throws Exception {
        File target = new File(args[0]);
        ByteSourceFile byteSource = new ByteSourceFile(target);

        // Establish a TiffReader. This is just a simple constructor that
        // does not actually access the file.  So the application cannot
        // obtain the byteOrder, or other details, until the contents have
        // been read.
        TiffReader tiffReader = new TiffReader(true);

        // Read the directories in the TIFF file.  Directories are the
        // main data element of a TIFF file. They usually include an image
        // element, but sometimes just carry metadata. This example
        // reads all the directories in the file.   Typically, reading
        // the directories is not a time-consuming operation.
        TiffContents contents = tiffReader.readDirectories(
            byteSource,
            true, // indicates that application should read image data, if present
            FormatCompliance.getDefault());

        // Read the first Image File Directory (IFD) in the file.  A practical
        // implementation could use any of the directories in the file.
        // By convention, the main payload (image or raster data) is
        // stored in the first TIFF directory with optional thumbnail images
        // or metadata directories to follow.
        TiffDirectory directory = contents.directories.get(0);

        // Check that the first directory in the file has raster data.
        if (!directory.hasTiffRasterData()) {
            System.out.println(
                "Specified directory does not contain raster data");
            System.exit(-1);
        }

        // Read all the raster data for the first directory.
        // The return value may carry short integers or single-precision
        // floating point values.  The optional parameter, which is set to
        // null in this example, could be used to fetch only a subset
        // of the overall data.
        TiffRasterData raster = directory.getRasterData(null);

        // The data type may be integral or floating point.
        TiffRasterDataType rasterDataType = raster.getDataType();
        System.out.println("Data type for raster: " + rasterDataType.name());
        int width = raster.getWidth();
        int height = raster.getHeight();

        int nSamples = 0;
        double sumSamples = 0;
        if (rasterDataType == TiffRasterDataType.INTEGER) {
            // data type is integral, so we use getIntValue()
            for (int y = 0; y < height; y++) {
                for (int x = 0; x < width; x++) {
                    int sample = raster.getIntValue(x, y);
                    if (sample > 0) {
                        nSamples++;
                        // The archived copy of this listing is cut off at
                        // this point.  Everything from here on is an assumed
                        // completion, not the original text.
                        sumSamples += sample;
                    }
                }
            }
        } else {
            // data type is floating point, so we use getValue()
            for (int y = 0; y < height; y++) {
                for (int x = 0; x < width; x++) {
                    float sample = raster.getValue(x, y);
                    if (!Float.isNaN(sample) && sample > 0) {
                        nSamples++;
                        sumSamples += sample;
                    }
                }
            }
        }

        if (nSamples > 0) {
            System.out.println(
                "Average of positive samples: " + (sumSamples / nSamples));
        }
    }
}
{code}

[jira] [Updated] (IMAGING-312) Alpha-channel setting not interpreted from ExtraSamples tag

2021-09-12 Thread Gary Lucas (Jira)


 [ 
https://issues.apache.org/jira/browse/IMAGING-312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated IMAGING-312:
---
Attachment: Imaging312.png

> Alpha-channel setting not interpreted from ExtraSamples tag
> ---
>
> Key: IMAGING-312
> URL: https://issues.apache.org/jira/browse/IMAGING-312
> Project: Commons Imaging
>  Issue Type: Bug
>  Components: Format: TIFF
>Affects Versions: 1.0-alpha2
> Environment:  
>Reporter: Gary Lucas
>Priority: Major
> Attachments: Imaging312.png
>
>
> Commons Imaging sometimes misinterprets TIFF files that store RGB pixels in 4 
> bytes but do not define alpha.   In some cases, these images are treated as 
> semi-transparent when they should be opaque.   Commons Imaging is not unique 
> in this regard...  Windows Photo Viewer does the same thing.
> The TIFF specification allows RGB images to be encoded with 4 bytes per 
> pixel.  It would be natural to assume (as Commons Imaging does) that the 4th 
> byte is the alpha channel and that it would have a value of 0xff wherever a 
> pixel is opaque. However, the interpretation of the 4th byte depends 
> on information in the TIFF "ExtraSamples" tag. 
> It turns out that there are images in the wild that use 4 bytes, but populate 
> the 4th byte with junk values. For example, there are a number of older 
> aerial photographs from the US Geological Survey (USGS) that do this.  These 
> images give an ExtraSamples tag with a value of zero.  But the TIFF 
> specification calls for images to be treated as having alpha channels only if 
> the ExtraSamples field carries a value of either 1 or 2.   When ExtraSamples 
> has a value of 0, the 4th byte is to be ignored. 
> So while the USGS TIFF files are in compliance with the TIFF specification, 
> their behavior is unintuitive.  Because the Commons Imaging library 
> assumes that the 4th byte carries valid alpha values, it does 
> not render the images correctly.
> There are many examples of this phenomenon on the USGS Earth Explorer 
> website. One specific example: 
> * High Resolution Orthoimagery
> * Dataset: 201203_connecticut_state_lot1_ct_0x3000m_utm_cnir
> * Entity: 2818289_18TYL425825
> * File: 18tyl425825.tif 
> *Proposed Fix*
> I propose to do the following:
> * Extend the TiffImageParser logic for detecting alpha to assume hasAlpha is 
> true if and only if the ExtraSamples tag is supplied and contains a value of 1 
> or 2. 
> * Provide a hasAlpha accessor for the ImageBuilder class (it should really 
> have one anyway)
> * Enhance the DataReaderStrips and DataReaderTiles classes to check hasAlpha 
> when processing RGB images that have 4 samples per pixel. 
> *Concerns*
> I am not sure what to do if an RGB TIFF image uses 4 samples 
> per pixel but the ExtraSamples tag is not provided.  So far, I have not 
> seen an example of this, but my collection of sample TIFF files is rather 
> narrow and I would not rule it out.





[jira] [Commented] (IMAGING-312) Alpha-channel setting not interpreted from ExtraSamples tag

2021-09-12 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17413677#comment-17413677
 ] 

Gary Lucas commented on IMAGING-312:


Here's an example comparing the image cited above as rendered by Windows Photo 
Viewer and the new version of the Commons Imaging API that I will be submitting 
in a couple of days.   You can see the effect of misinterpreting the 4th byte 
for each pixel as giving alpha values.  In reality, the metadata in the TIFF 
file indicates that the 4th byte should be ignored.

 

!Imaging312.png!   






[jira] [Updated] (IMAGING-311) Read TIFFs with multiple floating-point samples

2021-10-08 Thread Gary Lucas (Jira)


 [ 
https://issues.apache.org/jira/browse/IMAGING-311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated IMAGING-311:
---
Attachment: NBS_Example.jpg

> Read TIFFs with multiple floating-point samples
> ---
>
> Key: IMAGING-311
> URL: https://issues.apache.org/jira/browse/IMAGING-311
> Project: Commons Imaging
>  Issue Type: New Feature
>  Components: Format: TIFF
>Affects Versions: 1.0-alpha3
> Environment: 
> [IMAGING-251|https://issues.apache.org/jira/browse/IMAGING-251]
>Reporter: Gary Lucas
>Priority: Major
> Attachments: NBS_Example.jpg
>
>
> I propose to extend Commons Imaging to support reading TIFF files that 
> contain floating-point formats that feature more than one sample.  The 
> ability to support floating-point samples was introduced in  
> [IMAGING-251|https://issues.apache.org/jira/browse/IMAGING-251].  But that 
> implementation was limited to only those files that provided a single sample 
> per raster cell (i.e. "a single sample per pixel"). The ability to read files 
> with multiple samples per raster cell would extend the usefulness of the 
> Commons Imaging API, particularly for geophysical applications.
> If anyone knows of good sources for test TIFF files that use this format, 
> please let me know.
> *Background*
> In addition to conventional image data, TIFF files can provide floating-point 
> numerical information.  This feature is often used for geophysical data (in 
> GeoTIFF files), but can also be applied to other uses.  Although the existing 
> implementation can support files which give a single value per raster cell, 
> there are some products that carry multiple samples per cell. Examples 
> include products that give both a measured value and a corresponding accuracy 
> estimate (e.g. 245.6 meters plus or minus 0.5 meters). Some geophysical 
> products give vectors (gravitational potential, wind vectors, ocean currents, 
> etc.).
> Changes would involve extensions to the classes in the TIFF datareaders 
> package as well as the TiffRasterData class.





[jira] [Commented] (IMAGING-311) Read TIFFs with multiple floating-point samples

2021-10-08 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17426357#comment-17426357
 ] 

Gary Lucas commented on IMAGING-311:


I've made some progress on this issue and have submitted a pull request to 
Commons Imaging.  A bit more work is still required (mostly additional JUnit 
tests and a few missing accessor methods).

This change will enable Imaging to support additional data sets such as the 
National Bathymetric Source (NBS) GeoTIFF raster products.  These products give 
each ocean-depth value together with an accuracy estimate.  The attached picture 
illustrates the concept.  The terrestrial features are from work that was done for 
IMAGING-251, but the oceanographic information comes from multi-variable 
GeoTIFFs that were not previously accessible by the Imaging API (or any other 
Java API that I know of).

 

!NBS_Example.jpg!

 

I built the imagery using a combination of Commons Imaging, some open-source 
software I wrote to demonstrate [Shaded Relief Rendering 
Techniques|https://gwlucastrig.github.io/gridfour/notes/ElevationGeoTiff1.html],
 and some geographic mapping modules I developed for my employer's Java-based 
[wXstation|http://www.sonalysts.com/products/wxstation/] product.

 

 






[jira] [Commented] (IMAGING-127) API to get a single image should allow choosing which image

2021-10-12 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17427769#comment-17427769
 ] 

Gary Lucas commented on IMAGING-127:


I don't know what the level of interest is for this issue.   But I do know of 
an example application in the "test" hierarchy that shows one way to extract 
images and metadata. It uses low-level calls and is a bit too specialized for 
general application work, but it might help get you started on your own 
implementation.

Directory:

  commons-imaging-master\src\test\java\org\apache\commons\imaging\examples\tiff

File:

ReadTagsAndImages.java

 

> API to get a single image should allow choosing which image
> ---
>
> Key: IMAGING-127
> URL: https://issues.apache.org/jira/browse/IMAGING-127
> Project: Commons Imaging
>  Issue Type: Improvement
>Reporter: Trejkaz
>Priority: Major
> Fix For: Patch Needed
>
> Attachments: 2472527552.gif, Wakarusa2015-0001.mpo, june 1 part I.tif
>
>
> getBufferedImage() only returns the first image. There should be a way to 
> retrieve any image by index (and by extension, an API to get the image count.)
> getBufferedImages() cannot be used for large multi-page TIFF files, because 
> creating that many BufferedImage objects causes an OutOfMemoryError.
> (For that matter, a method to get a scaled down copy would be useful as well, 
> as some formats can optimise that not to retrieve all the data, but also it 
> means you can reduce the memory usage for absolutely massive images.)
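
The request quoted above (an image count plus retrieval by index, without building every BufferedImage up front) could be sketched roughly as follows. This is only an illustration of the idea: LazyImageList and its methods are hypothetical names and do not exist in Commons Imaging. Each entry holds a decoder that runs only when its image is requested.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical index-based accessor that decodes images on demand. */
final class LazyImageList {
    private final List<Supplier<BufferedImage>> decoders = new ArrayList<>();

    /** Registers a decoder for the next image without running it. */
    void addDecoder(Supplier<BufferedImage> decoder) {
        decoders.add(decoder);
    }

    /** Number of images available, known without decoding any of them. */
    int getImageCount() {
        return decoders.size();
    }

    /** Decodes and returns only the requested image. */
    BufferedImage getImage(int index) {
        if (index < 0 || index >= decoders.size()) {
            throw new IndexOutOfBoundsException("No image at index " + index);
        }
        return decoders.get(index).get();
    }
}
```

Because nothing is cached, memory use stays bounded by the single image being read, at the cost of decoding again if the same index is requested twice.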





[jira] [Commented] (IMAGING-194) Tiff with JPEG,Zip compression fails to decompress

2021-10-13 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17428230#comment-17428230
 ] 

Gary Lucas commented on IMAGING-194:


Does anyone have some manageable-sized, public domain files that we can use to 
develop/test this capability?

> Tiff with JPEG,Zip compression fails to decompress
> --
>
> Key: IMAGING-194
> URL: https://issues.apache.org/jira/browse/IMAGING-194
> Project: Commons Imaging
>  Issue Type: Improvement
>  Components: Format: TIFF
>Affects Versions: 1.0-alpha1
>Reporter: Satya Deep Maheshwari
>Priority: Major
>
> Tiff with JPEG, Zip compression  fails to decompress with the below exception:
> {code}
> org.apache.commons.imaging.ImageReadException: Tiff: unknown/unsupported 
> compression: 7
>   at 
> org.apache.commons.imaging.formats.tiff.datareaders.DataReader.decompress(DataReader.java:215)
>   at 
> org.apache.commons.imaging.formats.tiff.datareaders.DataReaderStrips.readImageData(DataReaderStrips.java:210)
>   at 
> org.apache.commons.imaging.formats.tiff.TiffImageParser.getBufferedImage(TiffImageParser.java:650)
>   at 
> org.apache.commons.imaging.formats.tiff.TiffDirectory.getTiffImage(TiffDirectory.java:157)
>   at 
> org.apache.commons.imaging.formats.tiff.TiffImageParser.getBufferedImage(TiffImageParser.java:463)
>   at 
> org.apache.commons.imaging.Imaging.getBufferedImage(Imaging.java:1407)
>   at 
> org.apache.commons.imaging.Imaging.getBufferedImage(Imaging.java:1370)
> {code}
> From the 
> [documentation|https://commons.apache.org/proper/commons-imaging/formatsupport.html]
>  , it seems this compression format is not supported. Excerpt from the 
> document below:
> {quote}
> Supported through version 6.0. TIFFs is a open-ended container format, so 
> it's not possible to support every possibly variation. Supports Bi-Level, 
> Palette/Indexed, RGB, CMYK, YCbCr, CIELab and LOGLUV images. Supports reading 
> and writing LZW, CCITT Modified Huffman/Group 3/Group 4, and Packbits/RLE 
> compression. Notably missing other forms of compression though, including 
> JPEG. Supports reading Tiled images.
> {quote}
> This ticket is logged to add JPEG/Zip compression format support.
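
For context on the "compression: 7" in the stack trace above: the TIFF Compression tag (259) holds a numeric code, and the TIFF specification assigns 7 to JPEG and 8 to the Adobe-style Deflate ("Zip") scheme. The following is a minimal sketch of a lookup for commonly seen codes. The class TiffCompressionNames and its describe method are hypothetical names used for illustration only; they are not part of Commons Imaging, and the table is not exhaustive.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not a Commons Imaging class): names for TIFF Compression tag values. */
final class TiffCompressionNames {
    private static final Map<Integer, String> NAMES = new LinkedHashMap<>();
    static {
        NAMES.put(1, "Uncompressed");
        NAMES.put(2, "CCITT Modified Huffman (RLE)");
        NAMES.put(3, "CCITT Group 3");
        NAMES.put(4, "CCITT Group 4");
        NAMES.put(5, "LZW");
        NAMES.put(6, "JPEG (old-style)");
        NAMES.put(7, "JPEG");
        NAMES.put(8, "Deflate (Adobe/Zip)");
        NAMES.put(32773, "PackBits");
        NAMES.put(32946, "Deflate (legacy code)");
    }

    /** Returns a readable name for a compression code, or a placeholder for unlisted codes. */
    static String describe(int code) {
        return NAMES.getOrDefault(code, "Unknown (" + code + ")");
    }

    public static void main(String[] args) {
        // The code reported in the exception, and the Zip-style code named in the ticket title.
        System.out.println("7 -> " + describe(7));
        System.out.println("8 -> " + describe(8));
    }
}
```

A lookup like this only names the scheme; decoding the strips or tiles is the actual work the ticket asks for.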





[jira] [Comment Edited] (IMAGING-194) Tiff with JPEG,Zip compression fails to decompress

2021-10-13 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17428230#comment-17428230
 ] 

Gary Lucas edited comment on IMAGING-194 at 10/13/21, 2:27 PM:
---

Does anyone have some manageable-sized, public domain files that we can use to 
develop/test this capability?

Does anyone have a current need for this feature?

Through my job, I've got some interest in accessing JPEG based TIFFs.  The U.S. 
National Weather Service has started posting some satellite images as GeoTIFFs 
containing JPEG-formatted data. I will be investigating the level-of-effort for 
implementing the ability to read such files. But the task may be out-of-scope 
for what I am willing to take on unless there are a significant number of 
people interested in the feature.

You can find the satellite images of interest at 

[GOES-East - Latest CONUS Images - NOAA / NESDIS / 
STAR|https://www.star.nesdis.noaa.gov/GOES/conus.php?sat=G16]

and 

[GOES Imagery Viewer - NOAA / NESDIS / 
STAR|https://www.star.nesdis.noaa.gov/GOES/]

Some of the satellite-derived imagery is really quite beautiful.



> Tiff with JPEG,Zip compression fails to decompress
> --
>
> Key: IMAGING-194
> URL: https://issues.apache.org/jira/browse/IMAGING-194
> Project: Commons Imaging
>  Issue Type: Improvement
>  Components: Format: TIFF
>Affects Versions: 1.0-alpha1
>Reporter: Satya Deep Maheshwari
>Priority: Major
>
> Tiff with JPEG, Zip compression  fails to decompress with the below exception:
> {code}
> org.apache.commons.imaging.ImageReadException: Tiff: unknown/unsupported 
> compression: 7
>   at 
> org.apache.commons.imaging.formats.tiff.datareaders.DataReader.decompress(DataReader.java:215)
>   at 
> org.apache.commons.imaging.formats.tiff.datareaders.DataReaderStrips.readImageData(DataReaderStrips.java:210)
>   at 
> org.apache.commons.imaging.formats.tiff.TiffImageParser.getBufferedImage(TiffImageParser.java:650)
>   at 
> org.apache.commons.imaging.formats.tiff.TiffDirectory.getTiffImage(TiffDirectory.java:157)
>   at 
> org.apache.commons.imaging.formats.tiff.TiffImageParser.getBufferedImage(TiffImageParser.java:463)
>   at 
> org.apache.commons.imaging.Imaging.getBufferedImage(Imaging.java:1407)
>   at 
> org.apache.commons.imaging.Imaging.getBufferedImage(Imaging.java:1370)
> {code}
> From the 
> [documentation|https://commons.apache.org/proper/commons-imaging/formatsupport.html]
>  , it seems this compression format is not supported. Excerpt from the 
> document below:
> {quote}
> Supported through version 6.0. TIFFs is a open-ended container format, so 
> it's not possible to support every possibly variation. Supports Bi-Level, 
> Palette/Indexed, RGB, CMYK, YCbCr, CIELab and LOGLUV images. Supports reading 
> and writing LZW, CCITT Modified Huffman/Group 3/Group 4, and Packbits/RLE 
> compression. Notably missing other forms of compression though, including 
> JPEG. Supports reading Tiled images.
> {quote}
> This ticket is logged to add JPEG/Zip compression format support.





[jira] [Created] (IMAGING-313) Provide summary of GeoTIFF tags in example TIFF-dump application

2021-10-17 Thread Gary Lucas (Jira)
Gary Lucas created IMAGING-313:
--

 Summary: Provide summary of GeoTIFF tags in example TIFF-dump 
application
 Key: IMAGING-313
 URL: https://issues.apache.org/jira/browse/IMAGING-313
 Project: Commons Imaging
  Issue Type: New Feature
  Components: Format: TIFF
Affects Versions: 1.0-alpha3
Reporter: Gary Lucas








[jira] [Commented] (IMAGING-313) Provide summary of GeoTIFF tags in example TIFF-dump application

2021-10-17 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429778#comment-17429778
 ] 

Gary Lucas commented on IMAGING-313:


The existing Apache Commons Imaging distribution includes an example 
application that opens a TIFF image file and extracts the metadata ("tags" in 
the TIFF parlance) for inspection.  I propose to extend the utility to include 
a summary of GeoTIFF-specific tags.

GeoTIFFs are an important class of TIFF files that are used to show imagery 
that has a geographic basis. They include satellite images, aerial photographs,
digitized maps, and even some numerical data such as elevations. The GeoTIFF
standard is a well-known, though slightly complex, specification.

The current example application, ReadTagsAndImages.java, prints all tags in the 
TIFF file, but the GeoTIFF related information is presented as abstract 
numerical values. I propose to add logic to the tag-reading application to 
format some of that GeoTIFF information as human-friendly strings. 

Here's an example of a numeric data file giving high-resolution elevation data 
(some tags omitted).  The GeoKeyDirectoryTag is essentially a dictionary giving 
a guide to the content to follow.  In this case, it consists of 36 integer
values.  In the Summary of GeoTIFF Elements, the proposed application
interprets those integer constants as named strings.
{code:java}
Directory  0 Numeric raster data, description: Root
 256 (0x100: ImageWidth): 10812 (1 Short)
 257 (0x101: ImageLength): 10812 (1 Short)
 [snip]
 34735 (0x87af: GeoKeyDirectoryTag): 1, 1, 0, 8, 1024, 0, 1, 2, 1025, 0, 1, 1, 
(36 elements)
 34737 (0x87b1: GeoAsciiParamsTag): 'NAD83|' (7 ASCII)
 42112 (0xa480: GDALMetadata): '
  NONE
 [snip]
{code}

> Provide summary of GeoTIFF tags in example TIFF-dump application
> 
>
> Key: IMAGING-313
> URL: https://issues.apache.org/jira/browse/IMAGING-313
> Project: Commons Imaging
>  Issue Type: New Feature
>  Components: Format: TIFF
>Affects Versions: 1.0-alpha3
>Reporter: Gary Lucas
>Priority: Minor
>
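
As a rough illustration of the GeoKeyDirectoryTag layout described in the comment above: in the GeoTIFF specification, the directory starts with a four-value header (directory version, key revision, minor revision, number of keys), followed by one four-value entry per key (key ID, TIFF tag location, count, value or offset). With 8 keys declared, that gives 4 + 8 * 4 = 36 values, which matches the listing. The sketch below decodes only the twelve values visible in the listing. GeoKeyDirectorySketch, Entry, parse, and keyName are hypothetical names for illustration; this is not the proposed implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch (not Commons Imaging code): decoding GeoKeyDirectoryTag values. */
final class GeoKeyDirectorySketch {

    /** One four-value key entry from the directory. */
    static final class Entry {
        final int keyId, tagLocation, count, valueOrOffset;

        Entry(int keyId, int tagLocation, int count, int valueOrOffset) {
            this.keyId = keyId;
            this.tagLocation = tagLocation;
            this.count = count;
            this.valueOrOffset = valueOrOffset;
        }
    }

    /** Parses as many complete entries as the array holds, up to the declared key count. */
    static List<Entry> parse(int[] dir) {
        if (dir.length < 4) {
            throw new IllegalArgumentException("GeoKeyDirectory header needs 4 values");
        }
        int declared = dir[3];                 // number of keys stated in the header
        int available = (dir.length - 4) / 4;  // complete entries actually present
        int n = Math.min(declared, available);
        List<Entry> entries = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            int base = 4 + 4 * i;
            entries.add(new Entry(dir[base], dir[base + 1], dir[base + 2], dir[base + 3]));
        }
        return entries;
    }

    /** Names for the two key IDs visible in the listing; other keys fall through. */
    static String keyName(int keyId) {
        switch (keyId) {
            case 1024: return "GTModelTypeGeoKey";
            case 1025: return "GTRasterTypeGeoKey";
            default:   return "GeoKey " + keyId;
        }
    }

    public static void main(String[] args) {
        // Only the first 12 of the 36 values appear in the listing.
        int[] shown = {1, 1, 0, 8, 1024, 0, 1, 2, 1025, 0, 1, 1};
        for (Entry e : parse(shown)) {
            // A tag location of 0 means the value is stored directly in the entry.
            System.out.println(keyName(e.keyId) + " = " + e.valueOrOffset);
        }
    }
}
```

For the values shown, the model type is 2 (geographic) and the raster type is 1 (pixel is area) in GeoTIFF terms; a summary feature would map such codes to strings in a similar way.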






[jira] [Comment Edited] (IMAGING-313) Provide summary of GeoTIFF tags in example TIFF-dump application

2021-10-17 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429778#comment-17429778
 ] 

Gary Lucas edited comment on IMAGING-313 at 10/17/21, 11:32 PM:


The existing Apache Commons Imaging distribution includes an example 
application that opens a TIFF image file and extracts the metadata ("tags" in 
the TIFF parlance) for inspection.  I propose to extend the utility to include 
a summary of GeoTIFF-specific tags.

GeoTIFFs are an important class of TIFF files that are used to show imagery 
that has a geographic basis. They include satellite images, aerial photographs,
digitized maps, and even some numerical data such as elevations. The GeoTIFF
standard is a well-known, though slightly complex, specification.

The current example application, ReadTagsAndImages.java, prints all tags in the 
TIFF file, but the GeoTIFF related information is presented as abstract 
numerical values. I propose to add logic to the tag-reading application to 
format some of that GeoTIFF information as human-friendly strings. 

Here's an example of a numeric data file giving high-resolution elevation data 
(some tags omitted).  The GeoKeyDirectoryTag is essentially a dictionary giving 
a guide to the content to follow.  In this case, it consists of 36 integer
values.  In the Summary of GeoTIFF Elements, the proposed application
interprets those integer constants as named strings.
{code:java}
Directory  0 Numeric raster data, description: Root
 256 (0x100: ImageWidth): 10812 (1 Short)
 257 (0x101: ImageLength): 10812 (1 Short)
 [snip]
 34735 (0x87af: GeoKeyDirectoryTag): 1, 1, 0, 8, 1024, 0, 1, 2, 1025, 0, 1, 1, 
(36 elements)
 34737 (0x87b1: GeoAsciiParamsTag): 'NAD83|' (7 ASCII)
 42112 (0xa480: GDALMetadata): '
  NONE
  NONE
 [snip]
{code}

> Provide summary of GeoTIFF tags in example TIFF-dump application
> 
>
> Key: IMAGING-313
> URL: https://issues.apache.org/jira/browse/IMAGING-313
> Project: Commons Imaging
>  Issue Type: New Feature
>  Components: Format: TIFF
>Affects Versions: 1.0-alpha3
>Reporter: Gary Lucas
>Priority: Minor
>






[jira] [Comment Edited] (IMAGING-313) Provide summary of GeoTIFF tags in example TIFF-dump application

2021-10-17 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429778#comment-17429778
 ] 

Gary Lucas edited comment on IMAGING-313 at 10/17/21, 11:33 PM:


The existing Apache Commons Imaging distribution includes an example 
application that opens a TIFF image file and extracts the metadata ("tags" in 
the TIFF parlance) for inspection.  I propose to extend the utility to include 
a summary of GeoTIFF-specific tags.

GeoTIFFs are an important class of TIFF files that are used to show imagery 
that has a geographic basis. They include satellite images, aerial photographs,
digitized maps, and even some numerical data such as elevations. The GeoTIFF
standard is a well-known, though slightly complex, specification.

The current example application, ReadTagsAndImages.java, prints all tags in the 
TIFF file, but the GeoTIFF related information is presented as abstract 
numerical values. I propose to add logic to the tag-reading application to 
format some of that GeoTIFF information as human-friendly strings. 

Here's an example of a numeric data file giving high-resolution elevation data 
(some tags omitted).  The GeoKeyDirectoryTag is essentially a dictionary giving 
a guide to the content to follow.  In this case, it consists of 36 integer
values.  In the Summary of GeoTIFF Elements, the proposed application
interprets those integer constants as named strings.
{code:java}
Directory  0 Numeric raster data, description: Root
 256 (0x100: ImageWidth): 10812 (1 Short)
 257 (0x101: ImageLength): 10812 (1 Short)
 [snip]
 34735 (0x87af: GeoKeyDirectoryTag): 1, 1, 0, 8, 1024, 0, 1, 2, 1025, 0, 1, 1, 
(36 elements)
 34737 (0x87b1: GeoAsciiParamsTag): 'NAD83|' (7 ASCII)
 42112 (0xa480: GDALMetadata): '
  NONE
  NONE
 [snip]
{code}

> Provide summary of GeoTIFF tags in example TIFF-dump application
> 
>
> Key: IMAGING-313
> URL: https://issues.apache.org/jira/browse/IMAGING-313
> Project: Commons Imaging
>  Issue Type: New Feature
>  Components: Format: TIFF
>Affects Versions: 1.0-alpha3
>Reporter: Gary Lucas
>Priority: Minor
>






[jira] [Comment Edited] (IMAGING-313) Provide summary of GeoTIFF tags in example TIFF-dump application

2021-10-17 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429778#comment-17429778
 ] 

Gary Lucas edited comment on IMAGING-313 at 10/17/21, 11:35 PM:


The existing Apache Commons Imaging distribution includes an example 
application that opens a TIFF image file and extracts the metadata ("tags" in 
the TIFF parlance) for inspection.  I propose to extend the utility to include 
a summary of GeoTIFF-specific tags.

GeoTIFFs are an important class of TIFF files that are used to show imagery 
that has a geographic basis. They include satellite images, aerial photographs,
digitized maps, and even some numerical data such as elevations. The GeoTIFF
standard is a well-known, though slightly complex, specification.

The current example application, ReadTagsAndImages.java, prints all tags in the 
TIFF file, but the GeoTIFF related information is presented as abstract 
numerical values. I propose to add logic to the tag-reading application to 
format some of that GeoTIFF information as human-friendly strings. 

Here's an example of a numeric data file giving high-resolution elevation data 
(some tags omitted).  The GeoKeyDirectoryTag is essentially a dictionary giving 
a guide to the content to follow.  In this case, it consists of 36 integer
values.  In the Summary of GeoTIFF Elements, the proposed application
interprets those integer constants as named strings.
{code:java}
Directory  0 Numeric raster data, description: Root
 256 (0x100: ImageWidth): 10812 (1 Short)
 257 (0x101: ImageLength): 10812 (1 Short)
 [snip]
 34735 (0x87af: GeoKeyDirectoryTag): 1, 1, 0, 8, 1024, 0, 1, 2, 1025, 0, 1, 1, 
(36 elements)
 34737 (0x87b1: GeoAsciiParamsTag): 'NAD83|' (7 ASCII)
 42112 (0xa480: GDALMetadata): '
  NONE
  NONE
 [snip]
{code}

> Provide summary of GeoTIFF tags in example TIFF-dump application
> 
>
> Key: IMAGING-313
> URL: https://issues.apache.org/jira/browse/IMAGING-313
> Project: Commons Imaging
>  Issue Type: New Feature
>  Components: Format: TIFF
>Affects Versions: 1.0-alpha3
>Reporter: Gary Lucas
>Priority: Minor
>






[jira] [Comment Edited] (IMAGING-313) Provide summary of GeoTIFF tags in example TIFF-dump application

2021-10-17 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429778#comment-17429778
 ] 

Gary Lucas edited comment on IMAGING-313 at 10/17/21, 11:36 PM:


The existing Apache Commons Imaging distribution includes an example 
application that opens a TIFF image file and extracts the metadata ("tags" in 
the TIFF parlance) for inspection.  I propose to extend the utility to include 
a summary of GeoTIFF-specific tags.

GeoTIFFs are an important class of TIFF files that are used to show imagery 
that has a geographic basis. They include satellite images, aerial photographs,
digitized maps, and even some numerical data such as elevations. The GeoTIFF
standard is a well-known, though slightly complex, specification.

The current example application, ReadTagsAndImages.java, prints all tags in the 
TIFF file, but the GeoTIFF related information is presented as abstract 
numerical values. You can interpret these values if you have a copy of the 
GeoTIFF documentation handy, but it is a tedious process. I propose to add 
logic to the tag-reading application to format some of that GeoTIFF information 
as human-friendly strings. 

Here's an example of a numeric data file giving high-resolution elevation data 
(some tags omitted).  The GeoKeyDirectoryTag is essentially a dictionary giving 
a guide to the content to follow.  In this case, it consists of 36 integer
values.  In the Summary of GeoTIFF Elements, the proposed application
interprets those integer constants as named strings.
{code:java}
Directory  0 Numeric raster data, description: Root
 256 (0x100: ImageWidth): 10812 (1 Short)
 257 (0x101: ImageLength): 10812 (1 Short)
 [snip]
 34735 (0x87af: GeoKeyDirectoryTag): 1, 1, 0, 8, 1024, 0, 1, 2, 1025, 0, 1, 1, 
(36 elements)
 34737 (0x87b1: GeoAsciiParamsTag): 'NAD83|' (7 ASCII)
 42112 (0xa480: GDALMetadata): '
  NONE
  NONE
 [snip]
{code}

> Provide summary of GeoTIFF tags in example TIFF-dump application
> 
>
> Key: IMAGING-313
> URL: https://issues.apache.org/jira/browse/IMAGING-313
> Project: Commons Imaging
>  Issue Type: New Feature
>  Components: Format: TIFF
>Affects Versions: 1.0-alpha3
>Reporter: Gary Lucas
>Priority: Minor
>






[jira] [Created] (IMAGING-316) Support the BigTIFF file format

2021-11-02 Thread Gary Lucas (Jira)
Gary Lucas created IMAGING-316:
--

 Summary: Support the BigTIFF file format
 Key: IMAGING-316
 URL: https://issues.apache.org/jira/browse/IMAGING-316
 Project: Commons Imaging
  Issue Type: New Feature
Affects Versions: 1.x
Reporter: Gary Lucas


Traditional TIFF files address file position in bytes using 32-bit integers.  
This approach automatically limits the maximum size of a TIFF file to 4 GB.  
The BigTIFF specification (formalized in 2011) uses 64-bit integers to address 
file positions, and thus supports much larger files.  I propose that a future 
release of Commons Imaging would benefit from supporting BigTIFF.

The level of effort for this implementation may be large. 

In terms of creating JUnit tests to support this effort, note that just because 
a file uses the BigTIFF specification doesn't mean that the file has to be 
super large. It should be possible to create BigTIFF test files that are only a 
few kilobytes.  Thus supporting BigTIFF does not necessarily mean that massive 
files will need to be included in the Commons Imaging distribution.
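
To illustrate why a BigTIFF test file need not be large: what marks a file as BigTIFF is its 16-byte header, not its size. The header holds the byte-order mark ("II" or "MM"), version number 43 (classic TIFF uses 42), an offset size of 8, a reserved zero, and a 64-bit offset to the first directory. The sketch below builds and recognizes such a header in memory. BigTiffHeaderSketch and its methods are hypothetical names, not Commons Imaging API, and a header alone is not a complete, valid image file.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical sketch (not Commons Imaging code): the 16-byte BigTIFF header. */
final class BigTiffHeaderSketch {
    static final int CLASSIC_VERSION = 42;
    static final int BIGTIFF_VERSION = 43;

    /** Builds a BigTIFF header pointing at the given first-directory offset. */
    static byte[] bigTiffHeader(ByteOrder order, long firstIfdOffset) {
        ByteBuffer buf = ByteBuffer.allocate(16).order(order);
        byte mark = (order == ByteOrder.LITTLE_ENDIAN) ? (byte) 'I' : (byte) 'M';
        buf.put(mark).put(mark);                 // byte-order mark: "II" or "MM"
        buf.putShort((short) BIGTIFF_VERSION);   // 43 identifies BigTIFF
        buf.putShort((short) 8);                 // size of offsets in bytes
        buf.putShort((short) 0);                 // reserved, always zero
        buf.putLong(firstIfdOffset);             // 64-bit offset to the first directory
        return buf.array();
    }

    /** Returns true when the first four bytes carry a valid byte-order mark and version 43. */
    static boolean isBigTiff(byte[] header) {
        if (header.length < 4) {
            return false;
        }
        ByteOrder order;
        if (header[0] == 'I' && header[1] == 'I') {
            order = ByteOrder.LITTLE_ENDIAN;
        } else if (header[0] == 'M' && header[1] == 'M') {
            order = ByteOrder.BIG_ENDIAN;
        } else {
            return false;
        }
        int version = ByteBuffer.wrap(header).order(order).getShort(2) & 0xFFFF;
        return version == BIGTIFF_VERSION;
    }

    public static void main(String[] args) {
        byte[] header = bigTiffHeader(ByteOrder.LITTLE_ENDIAN, 16L);
        System.out.println(header.length + " bytes, BigTIFF: " + isBigTiff(header));
    }
}
```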

P.S.

It might be worth investigating whether the existing Imaging library actually 
supports the full 32-bit address space of a conventional TIFF.  Regrettably,  
Java doesn't support unsigned integer types.  And it is possible that a file 
address with the high bit set might be incorrectly interpreted as a negative 
number.  So I will be taking a look at the code to make sure all file addresses 
are properly masked when they are handed over to Java.
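
The masking mentioned here can be sketched as follows. A 32-bit offset read into a Java int becomes negative once it reaches 2 GB, and widening it to a long through a mask recovers the unsigned value. UnsignedOffsetSketch and toUnsignedOffset are hypothetical names for illustration and do not describe how the library currently handles offsets.

```java
/** Hypothetical sketch (not Commons Imaging code): treating a 32-bit file offset as unsigned. */
final class UnsignedOffsetSketch {

    /** Widens a raw 32-bit offset to a long, clearing the sign extension. */
    static long toUnsignedOffset(int raw) {
        return raw & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        int raw = 0x80000000;  // an offset of exactly 2 GB, with the high bit set
        System.out.println(raw);                    // prints -2147483648
        System.out.println(toUnsignedOffset(raw));  // prints 2147483648
    }
}
```

Since Java 8, Integer.toUnsignedLong(raw) gives the same result as the explicit mask.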

 

 





[jira] [Updated] (IMAGING-316) Support the BigTIFF file format

2021-11-02 Thread Gary Lucas (Jira)


 [ 
https://issues.apache.org/jira/browse/IMAGING-316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated IMAGING-316:
---
Description: 
Traditional TIFF files address file position in bytes using 32-bit integers.  
This approach automatically limits the maximum size of a TIFF file to 4 GB.  
The BigTIFF specification (formalized in 2011) uses 64-bit integers to address 
file positions, and thus supports much larger files.  I propose that a future 
release of Commons Imaging would benefit from supporting BigTIFF.

The level of effort for this implementation may be large. 

In terms of creating JUnit tests to support this effort, note that just because 
a file uses the BigTIFF specification doesn't mean that the file has to be 
super large. It should be possible to create BigTIFF test files that are only a 
few kilobytes.  Thus supporting BigTIFF does not necessarily mean that massive 
files will need to be included in the Commons Imaging distribution.

Finally, it is reasonable to ask if anyone would actually need images that were 
so large that they couldn't fit within 4 GB.   The short answer is that some 
folks in the Geographic Information Systems (GIS) community do work with images 
(or data sets) that large and, also, that some systems produce images in 
BigTIFF format even when ordinary TIFF would suffice.

 

P.S. It might be worth investigating whether the existing Imaging library 
actually supports the full 32-bit address space of a conventional TIFF.  
Regrettably,  Java doesn't support unsigned integer types.  And it is possible 
that a file address with the high bit set might be incorrectly interpreted as a 
negative number.  So I will be taking a look at the code to make sure all file 
addresses are properly masked when they are handed over to Java.

 

 



> Support the BigTIFF file format
> ---
>
> Key: IMAGING-316
> URL: https://issues.apache.org/jira/browse/IMAGING-316
> Project: Commons Imaging
>  Issue Type: New Feature
>Affects Versions: 1.x
>Reporter: Gary Lucas
>Priority: Major
>
> Traditional TIFF files address file position in bytes using 32-bit integers.  
> This approach automatically limits the maximum size of a TIFF file to 4 GB.  
> The BigTIFF specification (formalized in 2011) uses 64-bit integers to 
> address file positions, and thus supports much larger files.  I propose that 
> a future release of Commons Imaging would benefit from supporting BigTIFF.
> The level of effort for this implementation may be large. 
> In terms of creating JUnit tests to support this effort, note that just 
> because a file uses the BigTIFF specification doesn't mean that the file has 
> to be super large. It should be possible to create BigTIFF test files that 
> are only a few kilobytes.  Thus supporting BigTIFF does not necessarily mean 
> that massive files will need to be included in the Commons Imaging 
> distribution.
> Finally, it is reasonable to ask if anyone would actually need images that 
> were so large that they couldn't fit within 4 GB.   The short answer is that 
> some folks in the Geographic Information Systems (GIS) community do work with 
> images (or data sets) that large and, also, that some systems produce images 
> in BigTIFF format even when ordinary TIFF would suffice.
>  
> P.S. It might be worth investigating whether the existing Imaging library 
> actually supports the full 32-bit address space of a conventional TIFF.  
> Regrettably,  Java doesn't support unsigned integer types.  And it is 
> possible that a fil

[jira] [Commented] (IMAGING-316) Support the BigTIFF file format

2021-11-02 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437495#comment-17437495
 ] 

Gary Lucas commented on IMAGING-316:


Short answer, the BigTIFF specification simply calls for changes in the way 
file-positions are specified in a file.  It doesn't change the internal 
representation of content.  A TIFF file can contain multiple bundled images, 
and their individual formats would not change.

However, to provide a bit of background... TIFF actually is an image file 
specification.  And it also supports numeric grids (e.g., Earth surface 
elevation data sets).  TIFF can support data using various data compression 
methods, of which JPEG is just one.  In the case of TIFF, the file does not so 
much contain a JPEG image as it contains an image and uses JPEG-based data 
compression to store it.  And, again, all images in a TIFF file (whether one or 
many) are all in a TIFF-specified data format.

As an aside, right now there is a Jira item in place to upgrade Commons Imaging 
to handle TIFF files that contain JPEG-style images.  I wish it was as simple 
as pumping the content through a conventional JPEG API.  Although the TIFF 
format does call for JPEG methods to be used on TIFF files, the internal 
representation is different enough from the JPEG standard to mess things up.  I 
have only just started looking at that one and have no idea what the level of 
effort is going to be.  But, in answer to your question, I believe that 
particular issue would be independent of BigTIFF.

> Support the BigTIFF file format
> ---
>
> Key: IMAGING-316
> URL: https://issues.apache.org/jira/browse/IMAGING-316
> Project: Commons Imaging
>  Issue Type: New Feature
>Affects Versions: 1.x
>Reporter: Gary Lucas
>Priority: Major
>
> Traditional TIFF files address file position in bytes using 32-bit integers.  
> This approach automatically limits the maximum size of a TIFF file to 4 GB.  
> The BigTIFF specification (formalized in 2011) uses 64-bit integers to 
> address file positions, and thus supports much larger files.  I propose that 
> a future release of Commons Imaging would benefit from supporting BigTIFF.
> The level of effort for this implementation may be large. 
> In terms of creating JUnit tests to support this effort, note that just 
> because a file uses the BigTIFF specification doesn't mean that the file has 
> to be super large. It should be possible to create BigTIFF test files that 
> are only a few kilobytes.  Thus supporting BigTIFF does not necessarily mean 
> that massive files will need to be included in the Commons Imaging 
> distribution.
> Finally, it is reasonable to ask if anyone would actually need images that 
> were so large that they couldn't fit within 4 GB.   The short answer is that 
> some folks in the Geographic Information Systems (GIS) community do work with 
> images (or data sets) that large and, also, that some systems produce images 
> in BigTIFF format even when ordinary TIFF would suffice.
>  
> P.S. It might be worth investigating whether the existing Imaging library 
> actually supports the full 32-bit address space of a conventional TIFF.  
> Regrettably,  Java doesn't support unsigned integer types.  And it is 
> possible that a file address with the high bit set might be incorrectly 
> interpreted as a negative number.  So I will be taking a look at the code to 
> make sure all file addresses are properly masked when they are handed over to 
> Java.
>  
>  





[jira] [Comment Edited] (IMAGING-316) Support the BigTIFF file format

2021-11-02 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437495#comment-17437495
 ] 

Gary Lucas edited comment on IMAGING-316 at 11/2/21, 6:04 PM:
--

Short answer, the BigTIFF specification simply calls for changes in the way 
file-positions are specified in a file.  It doesn't change the internal 
representation of content.  A TIFF file can contain multiple bundled images, 
and their individual formats would not change.

However, to provide a bit of background... TIFF actually is an image file 
specification.  And it also supports numeric grids (e.g., Earth surface 
elevation data sets).  TIFF can support data using various data compression 
methods, of which JPEG is just one.  In the case of TIFF, the file does not so 
much contain a JPEG image as it contains an image and uses JPEG-based data 
compression to store it.  And, again, all images in a TIFF file (whether one or 
many) are all in a TIFF-specified data format.

As an aside, right now there is a Jira item in place to upgrade Commons Imaging 
to handle TIFF files that contain JPEG-style images, see
[Imaging-194|https://issues.apache.org/jira/browse/IMAGING-194].  I wish it was as simple as
pumping the content through a conventional JPEG API.  Although the TIFF format 
does call for JPEG methods to be used on TIFF files, the internal 
representation is different enough from the JPEG standard to mess things up.  I 
have only just started looking at that one and have no idea what the level of 
effort is going to be.  But, in answer to your question, I believe that 
particular issue would be independent of BigTIFF.



> Support the BigTIFF file format
> ---
>
> Key: IMAGING-316
> URL: https://issues.apache.org/jira/browse/IMAGING-316
> Project: Commons Imaging
>  Issue Type: New Feature
>Affects Versions: 1.x
>Reporter: Gary Lucas
>Priority: Major
>
> Traditional TIFF files address file position in bytes using 32-bit integers.  
> This approach automatically limits the maximum size of a TIFF file to 4 GB.  
> The BigTIFF specification (formalized in 2011) uses 64-bit integers to 
> address file positions, and thus supports much larger files.  I propose that 
> a future release of Commons Imaging would benefit from supporting BigTIFF.
> The level of effort for this implementation may be large. 
> In terms of creating JUnit tests to support this effort, note that just 
> because a file uses the BigTIFF specification doesn't mean that the file has 
> to be super large. It should be possible to create BigTIFF test files that 
> are only a few kilobytes.  Thus supporting BigTIFF does not necessarily mean 
> that massive files will need to be included in the Commons Imaging 
> distribution.
> Finally, it is reasonable to ask if anyone would actually need images that 
> were so large that they couldn't fit within 4 GB.   The short answer is that 
> some folks in the Geographic Information Systems (GIS) community do work with 
> images (or data sets) that large and, also, that some systems produce images 
> in BigTIFF format even when ordinary TIFF would suffice.
>  
> P.S. It might be worth investigating whether the existing Imaging library 
> actually supports the full 32-bit address space of a conventional TIFF.  
> Regrettably,  Java doesn't support unsigned integer types.  And it is 
> possible that a file address with the hi

[jira] [Comment Edited] (IMAGING-316) Support the BigTIFF file format

2021-11-02 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437495#comment-17437495
 ] 

Gary Lucas edited comment on IMAGING-316 at 11/2/21, 6:06 PM:
--

Short answer, the BigTIFF specification simply calls for changes in the way 
file-positions are specified in a file.  It doesn't change the internal 
representation of content.  A TIFF file can contain multiple bundled images, 
and their individual formats would not change.

However, to provide a bit of background... TIFF is itself an image file 
specification, and it also supports numeric grids (e.g. Earth surface 
elevation data sets).  TIFF can store data using various data compression 
methods, of which JPEG is just one.  In the case of TIFF, the file does not so 
much contain a JPEG image as it contains an image and uses JPEG-based data 
compression to store it.  And, again, all images in a TIFF file (whether one or 
many) are in a TIFF-specified data format.

As an aside, right now there is a Jira item in place to upgrade Commons Imaging 
to handle TIFF files that contain JPEG-style images; see 
[Imaging-194|https://issues.apache.org/jira/browse/IMAGING-194].  I wish it were 
as simple as pumping the content through a conventional JPEG API.  Although the 
TIFF format does call for JPEG methods to be used on TIFF files, the internal 
representation is different enough from the JPEG standard to mess things up.  I 
have only just started looking at that one and have no idea what the level of 
effort is going to be.  But, in answer to your question, I believe that 
particular issue would be independent of BigTIFF.



> Support the BigTIFF file format
> ---
>
> Key: IMAGING-316
> URL: https://issues.apache.org/jira/browse/IMAGING-316
> Project: Commons Imaging
>  Issue Type: New Feature
>Affects Versions: 1.x
>Reporter: Gary Lucas
>Priority: Major
>
> Traditional TIFF files address file position in bytes using 32-bit integers.  
> This approach automatically limits the maximum size of a TIFF file to 4 GB.  
> The BigTIFF specification (formalized in 2011) uses 64-bit integers to 
> address file positions, and thus supports much larger files.  I propose that 
> a future release of Commons Imaging would benefit from supporting BigTIFF.
> The level of effort for this implementation may be large. 
> In terms of creating JUnit tests to support this effort, note that just 
> because a file uses the BigTIFF specification doesn't mean that the file has 
> to be super large. It should be possible to create BigTIFF test files that 
> are only a few kilobytes.  Thus supporting BigTIFF does not necessarily mean 
> that massive files will need to be included in the Commons Imaging 
> distribution.
> Finally, it is reasonable to ask if anyone would actually need images that 
> were so large that they couldn't fit within 4 GB.   The short answer is that 
> some folks in the Geographic Information Systems (GIS) community do work with 
> images (or data sets) that large and, also, that some systems produce images 
> in BigTIFF format even when ordinary TIFF would suffice.
>  
> P.S. It might be worth investigating whether the existing Imaging library 
> actually supports the full 32-bit address space of a conventional TIFF.  
> Regrettably,  Java doesn't support unsigned i

[jira] [Created] (IMAGING-320) Read TIFFs with 32-bit integer samples

2022-01-07 Thread Gary Lucas (Jira)
Gary Lucas created IMAGING-320:
--

 Summary: Read TIFFs with 32-bit integer samples
 Key: IMAGING-320
 URL: https://issues.apache.org/jira/browse/IMAGING-320
 Project: Commons Imaging
  Issue Type: New Feature
  Components: Format: TIFF
Reporter: Gary Lucas


Issue 266 added the ability to read numerical data from TIFF files that contain 
16-bit integer samples. This feature addressed data products such as the 
Shuttle Radar Topography Mission (SRTM), which provides high-resolution 
terrestrial elevation data. Recently, I encountered an elevation product that 
uses 32-bit integer samples.  I propose to enhance the numerical-data functions 
to read 32-bit data.
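
As a rough illustration of what reading such samples involves, the sketch below decodes a block of raw bytes into 32-bit signed integers. It is not the proposed implementation: the class and method names (Int32SampleSketch, decodeInt32Samples) are hypothetical, and it assumes the bytes come from an uncompressed strip or tile whose samples are 32-bit two's-complement values stored in the file's byte order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration; not part of Commons Imaging.
class Int32SampleSketch {

    /** Decodes raw bytes into 32-bit signed samples using the given byte order. */
    static int[] decodeInt32Samples(byte[] raw, ByteOrder order) {
        if (raw.length % 4 != 0) {
            throw new IllegalArgumentException("Length is not a multiple of 4 bytes");
        }
        int[] samples = new int[raw.length / 4];
        // The int view inherits the byte order set on the underlying buffer.
        ByteBuffer.wrap(raw).order(order).asIntBuffer().get(samples);
        return samples;
    }

    public static void main(String[] args) {
        // Two little-endian samples: 10000 (0x00002710) and -1 (0xFFFFFFFF).
        byte[] raw = {0x10, 0x27, 0, 0, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF};
        int[] samples = decodeInt32Samples(raw, ByteOrder.LITTLE_ENDIAN);
        System.out.println(samples[0] + " " + samples[1]); // prints 10000 -1
    }
}
```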



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (IMAGING-327) Rename setExif method in TiffImagingParameters

2022-01-30 Thread Gary Lucas (Jira)
Gary Lucas created IMAGING-327:
--

 Summary: Rename setExif method in TiffImagingParameters
 Key: IMAGING-327
 URL: https://issues.apache.org/jira/browse/IMAGING-327
 Project: Commons Imaging
  Issue Type: Improvement
  Components: Format: TIFF
Reporter: Gary Lucas








[jira] [Commented] (IMAGING-327) Rename setExif method in TiffImagingParameters

2022-01-30 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17484408#comment-17484408
 ] 

Gary Lucas commented on IMAGING-327:


The new TiffImagingParameters class offers a welcome improvement over the 
previous API, but I suggest that we should rename one of the methods to better 
reflect what it does.

When creating a new TIFF file, an application passes in a new TiffDirectory 
using an instance of TiffOutputSet.

 
{code:java}
TiffOutputSet tiffOutputSet = new TiffOutputSet();
tiffOutputSet.addDirectory(tiffDirectory);
TiffImagingParameters tiffParams = new TiffImagingParameters();
tiffParams.setExif(tiffOutputSet);
{code}
I propose renaming the setExif() method to setOutputSet().  This change will 
affect some of the unit tests as well as the main API.  Although output sets 
may include TIFF tags related to the EXIF standard, not all output sets are 
EXIF data.

I believe that the setExif() method owes its name to the code used in the 
legacy implementation and the API that has now been replaced.  Since we are 
improving the API, it makes sense to change it.  Here's an example of the 
legacy version:
{code:java}
TiffOutputSet tiffOutputSet = new TiffOutputSet();
tiffOutputSet.addDirectory(tiffDirectory);
params.put(ImagingConstants.PARAM_KEY_EXIF, tiffOutputSet);
{code}
 

> Rename setExif method in TiffImagingParameters
> --
>
> Key: IMAGING-327
> URL: https://issues.apache.org/jira/browse/IMAGING-327
> Project: Commons Imaging
>  Issue Type: Improvement
>  Components: Format: TIFF
>Reporter: Gary Lucas
>Priority: Minor
>






[jira] [Commented] (IMAGING-319) updateExifMetadataLossless lost the first character of a String

2022-02-11 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17490983#comment-17490983
 ] 

Gary Lucas commented on IMAGING-319:


I haven't figured this out yet, but I've narrowed down the cause to 
TiffImageWriterLossless.  There is a method that attempts to update the file 
positions (offsets) of the various EXIF tags.   It's called 
updateOffsetsStep().

The code is very confusing.  As far as I can tell, at some point some of the 
space in the output file is determined to be "unused" and updateOffsetsStep() 
attempts to reuse it by finding available space and setting the tag output 
position to the available space.  If you bypass this operation by adding a 
diagnostic call to unusedElements.clear() right after the unusedElements list 
is established, everything works fine.

The call stack is basically:

ExifRewriter.updateExifMetadataLossless
ExifRewriter.writeExifSegment
TiffImageWriterLossless.write
TiffImageWriterLossless.updateOffsetsStep

> updateExifMetadataLossless lost the first character of a String
> ---
>
> Key: IMAGING-319
> URL: https://issues.apache.org/jira/browse/IMAGING-319
> Project: Commons Imaging
>  Issue Type: Bug
>  Components: Format: JPEG
>Affects Versions: 1.0-alpha2
>Reporter: Sicheng Yang
>Priority: Major
> Attachments: Screen Shot 2021-11-26 at 4.01.06 PM-1.png, Screen Shot 
> 2021-11-26 at 4.01.21 PM-1.png, iPhone12-geotag.JPG
>
>
> I tried to use TiffOutputSet to generate a new image. However, if a tag 
> contains a String, the program may drop the first character of the String.
>  
> import java.io.*;
> import org.apache.commons.imaging.ImageReadException;
> import org.apache.commons.imaging.ImageWriteException;
> import org.apache.commons.imaging.Imaging;
> import org.apache.commons.imaging.common.ImageMetadata;
> import org.apache.commons.imaging.formats.jpeg.JpegImageMetadata;
> import org.apache.commons.imaging.formats.jpeg.exif.ExifRewriter;
> import org.apache.commons.imaging.formats.tiff.TiffImageMetadata;
> import org.apache.commons.imaging.formats.tiff.write.TiffOutputSet;
> public class LibraryTest {
>     public static void main(String[] args) throws ImageReadException, 
> IOException, ImageWriteException {
>         File source = new File("./assets/iPhone12-geotag.JPG");
>         File result = new 
> File("./assets/results/editted-iPhone12-geotag.JPG");
>         final ImageMetadata metadata = Imaging.getMetadata(source);
>         final JpegImageMetadata jpegMetadata = (JpegImageMetadata) metadata;
>         final TiffImageMetadata exif = jpegMetadata.getExif();
>         TiffOutputSet outputSet = exif.getOutputSet();
>         BufferedOutputStream bufferedOutputStream = new 
> BufferedOutputStream(new FileOutputStream(result));
>         new ExifRewriter().updateExifMetadataLossless(source, 
> bufferedOutputStream, outputSet);
>     }
> }
>  
> This is the sample code.
> Tag value in original image
> !image-2021-11-26-16-01-58-645.png!
> Tag value in output image
> !image-2021-11-26-16-04-12-185.png!





[jira] [Commented] (IMAGING-319) updateExifMetadataLossless lost the first character of a String

2022-02-11 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17491005#comment-17491005
 ] 

Gary Lucas commented on IMAGING-319:


Okay, found it.

In the code below, the method loops through all the available free elements 
and finds one it calls "bestFit".  It is going to store the new data into that 
available space.  But TIFF files have a rule that offsets have to be even 
(a multiple of 2).  So there's a check to see if the offset is odd and, if it 
is, the code advances the offset by one.  The problem is that it doesn't 
recognize that advancing the offset reduces the amount of available 
space (bestFit.length).  So, the "excessLength" computation below will be 
incorrect.  If some subsequent element is an exact match for the incorrect 
excessLength value, it will overrun the unused space and clobber whatever 
follows.   In this case, the thing that got clobbered was the first byte of 
EXIF tag 0x9010.

The probability of this happening is small, but non-zero.  It is just luck that 
Sicheng Yang's data sample triggered the issue.

 
{quote}
long offset = bestFit.offset;
if ((offset & 1L) != 0) {
    offset += 1;
}
outputItem.setOffset(offset);
unusedElements.remove(bestFit);

if (bestFit.length > outputItemLength) {
    // not a perfect fit.
    final long excessOffset = bestFit.offset + outputItemLength;
    final int excessLength = bestFit.length - outputItemLength;
    unusedElements.add(new TiffElement.Stub(excessOffset,
            excessLength));
    // make sure the new element is in the correct order.
    unusedElements.sort(ELEMENT_SIZE_COMPARATOR);
    Collections.reverse(unusedElements);
}
{quote}
 

I re-wrote the code as follows.  It works.  Writing a JUnit test for this is 
going to be extremely difficult.

 
{quote}
unusedElements.remove(bestFit);
long offset = bestFit.offset;
int length = bestFit.length;
if ((offset & 1L) != 0) {
    // offsets have to be at a multiple of 2
    offset += 1;
    length -= 1;
}
outputItem.setOffset(offset);

if (length > outputItemLength) {
    // not a perfect fit.
    final long excessOffset = offset + outputItemLength;
    final int excessLength = length - outputItemLength;
    unusedElements.add(new TiffElement.Stub(excessOffset,
            excessLength));
    // make sure the new element is in the correct order.
    unusedElements.sort(ELEMENT_SIZE_COMPARATOR);
    Collections.reverse(unusedElements);
}
{quote}
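
To see the arithmetic with concrete numbers, the following standalone sketch compares the leftover region computed by the original logic with the one computed by the rewritten logic. The values (a free block of 20 bytes at odd offset 101 and an item of 12 bytes) and the names ExcessSpaceSketch, excessBuggy, and excessFixed are invented for illustration; only the offset and length arithmetic is taken from the snippets above.

```java
// Hypothetical illustration; not part of Commons Imaging.
class ExcessSpaceSketch {

    /** Leftover {offset, length} as the original code computed it. */
    static long[] excessBuggy(long freeOffset, int freeLength, int itemLength) {
        // The item is written at the aligned offset, but the leftover
        // region is still derived from the unaligned offset and length.
        return new long[] {freeOffset + itemLength, freeLength - itemLength};
    }

    /** Leftover {offset, length} as the rewritten code computes it. */
    static long[] excessFixed(long freeOffset, int freeLength, int itemLength) {
        long offset = freeOffset;
        int length = freeLength;
        if ((offset & 1L) != 0) {
            // Skipping one byte for alignment also costs one byte of space.
            offset += 1;
            length -= 1;
        }
        return new long[] {offset + itemLength, length - itemLength};
    }

    public static void main(String[] args) {
        long freeOffset = 101; // odd, so the item actually starts at 102
        int freeLength = 20;   // free bytes are 101..120
        int itemLength = 12;   // item occupies 102..113

        long[] buggy = excessBuggy(freeOffset, freeLength, itemLength);
        long[] fixed = excessFixed(freeOffset, freeLength, itemLength);
        System.out.println("buggy: offset=" + buggy[0] + " length=" + buggy[1]);
        // prints "buggy: offset=113 length=8"
        System.out.println("fixed: offset=" + fixed[0] + " length=" + fixed[1]);
        // prints "fixed: offset=114 length=7"
    }
}
```

With the original arithmetic, the leftover is recorded as 8 bytes starting at 113.  A later 8-byte item placed there is aligned to 114 and written to bytes 114..121, one byte past the end of the free block at 120, which is consistent with losing the first byte of the element that follows.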

> updateExifMetadataLossless lost the first character of a String
> ---
>
> Key: IMAGING-319
> URL: https://issues.apache.org/jira/browse/IMAGING-319
> Project: Commons Imaging
>  Issue Type: Bug
>  Components: Format: JPEG
>Affects Versions: 1.0-alpha2
>Reporter: Sicheng Yang
>Priority: Major
> Attachments: Screen Shot 2021-11-26 at 4.01.06 PM-1.png, Screen Shot 
> 2021-11-26 at 4.01.21 PM-1.png, iPhone12-geotag.JPG
>
>
> I tried to use TiffOutputSet to generate a new image. However, if a tag 
> contains a String, the program may drop the first character of the String.
>  
> import java.io.*;
> import org.apache.commons.imaging.ImageReadException;
> import org.apache.commons.imaging.ImageWriteException;
> import org.apache.commons.imaging.Imaging;
> import org.apache.commons.imaging.common.ImageMetadata;
> import org.apache.commons.imaging.formats.jpeg.JpegImageMetadata;
> import org.apache.commons.imaging.formats.jpeg.exif.ExifRewriter;
> import org.apache.commons.imaging.formats.tiff.TiffImageMetadata;
> import org.apache.commons.imaging.formats.tiff.write.TiffOutputSet;
> public class LibraryTest {
>     public static void main(String[] args) throws ImageReadException, 
> IOException, ImageWriteException {
>         File source = new File("./assets/iPhone12-geotag.JPG");
>         File result = new 
> File("./assets/results/editted-iPhone12-geotag.JPG");
>         final ImageMetadata metadata = Imaging.getMetadata(source);
>         final JpegImageMetadata jpegMetadata = (JpegImageMetadata) metadata;
>         final TiffImageMetadata exif = jpegMetadata.getExif();
>         TiffOutputSet outputSet = exif.getOutputSet();
>         BufferedOutputStream bufferedOutputStream 

[jira] [Comment Edited] (IMAGING-319) updateExifMetadataLossless lost the first character of a String

2022-02-11 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17491005#comment-17491005
 ] 

Gary Lucas edited comment on IMAGING-319 at 2/11/22, 4:06 PM:
--

Okay, found it.

In the code below, the method loops through all the available free elements 
and finds one it calls "bestFit".  It is going to store the new data into that 
available space.  But TIFF files have a rule that offsets have to be even 
(a multiple of 2).  So there's a check to see if the offset is odd and, if it 
is, the code advances the offset by one.  The problem is that it doesn't 
recognize that advancing the offset reduces the amount of available 
space (bestFit.length).  So, the "excessLength" computation below will be 
incorrect.  If some subsequent element is an exact match for the incorrect 
excessLength value, it will overrun the unused space and clobber whatever 
follows.   In this case, the thing that got clobbered was the first byte of 
EXIF tag 0x9010.

The probability of this happening is small, but non-zero.  It is just luck that 
Sicheng Yang's data sample triggered the issue.

 

                long offset = bestFit.offset;
                if ((offset & 1L) != 0) {
                    offset += 1;
                }
                outputItem.setOffset(offset);
                unusedElements.remove(bestFit);

                if (bestFit.length > outputItemLength) {
                    // not a perfect fit.
                    final long excessOffset = bestFit.offset + outputItemLength;
                    final int excessLength = bestFit.length - outputItemLength;
                    unusedElements.add(new TiffElement.Stub(excessOffset,
                            excessLength));
                    // make sure the new element is in the correct order.
                    unusedElements.sort(ELEMENT_SIZE_COMPARATOR);
                    Collections.reverse(unusedElements);
                }

I re-wrote the code as follows.  It works.  Writing a JUnit test for this is 
going to be extremely difficult unless we want to include the large test file 
in our code distribution (which I think would be a bad idea).

                long offset = bestFit.offset;
                int length = bestFit.length;
                if ((offset & 1L) != 0) {
                    // offsets have to be at a multiple of 2
                    offset += 1;
                    length -= 1;
                }
                outputItem.setOffset(offset);
                unusedElements.remove(bestFit);

                if (length > outputItemLength) {
                    // not a perfect fit.
                    final long excessOffset = offset + outputItemLength;
                    final int excessLength = length - outputItemLength;
                    unusedElements.add(new TiffElement.Stub(excessOffset,
                            excessLength));
                    // make sure the new element is in the correct order.
                    unusedElements.sort(ELEMENT_SIZE_COMPARATOR);
                    Collections.reverse(unusedElements);
                }



[jira] [Comment Edited] (IMAGING-319) updateExifMetadataLossless lost the first character of a String

2022-02-11 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17491005#comment-17491005
 ] 

Gary Lucas edited comment on IMAGING-319 at 2/11/22, 4:07 PM:
--

Okay, found it.

In the code below, the method loops through all the available free elements 
and finds one it calls "bestFit".  It is going to store the new data into that 
available space.  But TIFF files have a rule that offsets have to be even 
(a multiple of 2).  So there's a check to see if the offset is odd and, if it 
is, the code advances the offset by one.  The problem is that it doesn't 
recognize that advancing the offset reduces the amount of available 
space (bestFit.length).  So, the "excessLength" computation below will be 
incorrect.  If some subsequent element is an exact match for the incorrect 
excessLength value, it will overrun the unused space and clobber whatever 
follows.   In this case, the thing that got clobbered was the first byte of 
EXIF tag 0x9010.

The probability of this happening is small, but non-zero.  It is just luck that 
Sicheng Yang's data sample triggered the issue.

 
{code:java}
                long offset = bestFit.offset;
                if ((offset & 1L) != 0) {
                    offset += 1;
                }
                outputItem.setOffset(offset);
                unusedElements.remove(bestFit);

                if (bestFit.length > outputItemLength) {
                    // not a perfect fit.
                    final long excessOffset = bestFit.offset + outputItemLength;
                    final int excessLength = bestFit.length - outputItemLength;
                    unusedElements.add(new TiffElement.Stub(excessOffset,
                            excessLength));
                    // make sure the new element is in the correct order.
                    unusedElements.sort(ELEMENT_SIZE_COMPARATOR);
                    Collections.reverse(unusedElements);
                }

{code}


   
I re-wrote the code as follows.  It works.  Writing a JUnit test for this is 
going to be extremely difficult unless we want to include the large test file 
in our code distribution (which I think would be a bad idea).


{code:java}
                long offset = bestFit.offset;
                int length = bestFit.length;
                if ((offset & 1L) != 0) {
                    // offsets have to be at a multiple of 2
                    offset += 1;
                    length -= 1;
                }
                outputItem.setOffset(offset);
                unusedElements.remove(bestFit);

                if (length > outputItemLength) {
                    // not a perfect fit.
                    final long excessOffset = offset + outputItemLength;
                    final int excessLength = length - outputItemLength;
                    unusedElements.add(new TiffElement.Stub(excessOffset,
                            excessLength));
                    // make sure the new element is in the correct order.
                    unusedElements.sort(ELEMENT_SIZE_COMPARATOR);
                    Collections.reverse(unusedElements);
                }
{code}




[jira] [Comment Edited] (IMAGING-319) updateExifMetadataLossless lost the first character of a String

2022-02-11 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17490983#comment-17490983
 ] 

Gary Lucas edited comment on IMAGING-319 at 2/11/22, 4:09 PM:
--

I haven't figured this out yet, but I've narrowed down the cause to 
TiffImageWriterLossless.  There is a method that attempts to update the file 
positions (offsets) of the various EXIF tags.   It's called 
updateOffsetsStep().

The code is very confusing.  As far as I can tell, at some point some of the 
space in the output file is determined to be "unused" and updateOffsetsStep() 
attempts to reuse it by finding available space and setting the tag output 
position to the available space.  If you bypass this operation by adding a 
diagnostic call to unusedElements.clear() right after the unusedElements list 
is established, everything works fine.

 

 

 



> updateExifMetadataLossless lost the first character of a String
> ---
>
> Key: IMAGING-319
> URL: https://issues.apache.org/jira/browse/IMAGING-319
> Project: Commons Imaging
>  Issue Type: Bug
>  Components: Format: JPEG
>Affects Versions: 1.0-alpha2
>Reporter: Sicheng Yang
>Priority: Major
> Attachments: Screen Shot 2021-11-26 at 4.01.06 PM-1.png, Screen Shot 
> 2021-11-26 at 4.01.21 PM-1.png, iPhone12-geotag.JPG
>
>
> I tried to use TiffOutputSet to generate a new image. However, if a tag 
> contains a String, the program may drop the first character of the String.
>  
> import java.io.*;
> import org.apache.commons.imaging.ImageReadException;
> import org.apache.commons.imaging.ImageWriteException;
> import org.apache.commons.imaging.Imaging;
> import org.apache.commons.imaging.common.ImageMetadata;
> import org.apache.commons.imaging.formats.jpeg.JpegImageMetadata;
> import org.apache.commons.imaging.formats.jpeg.exif.ExifRewriter;
> import org.apache.commons.imaging.formats.tiff.TiffImageMetadata;
> import org.apache.commons.imaging.formats.tiff.write.TiffOutputSet;
> public class LibraryTest {
>     public static void main(String[] args) throws ImageReadException, 
> IOException, ImageWriteException {
>         File source = new File("./assets/iPhone12-geotag.JPG");
>         File result = new 
> File("./assets/results/editted-iPhone12-geotag.JPG");
>         final ImageMetadata metadata = Imaging.getMetadata(source);
>         final JpegImageMetadata jpegMetadata = (JpegImageMetadata) metadata;
>         final TiffImageMetadata exif = jpegMetadata.getExif();
>         TiffOutputSet outputSet = exif.getOutputSet();
>         BufferedOutputStream bufferedOutputStream = new 
> BufferedOutputStream(new FileOutputStream(result));
>         new ExifRewriter().updateExifMetadataLossless(source, 
> bufferedOutputStream, outputSet);
>     }
> }
>  
> This is the sample code.
> Tag value in original image
> !image-2021-11-26-16-01-58-645.png!
> Tag value in output image
> !image-2021-11-26-16-04-12-185.png!





[jira] [Comment Edited] (IMAGING-319) updateExifMetadataLossless lost the first character of a String

2022-02-11 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17491005#comment-17491005
 ] 

Gary Lucas edited comment on IMAGING-319 at 2/11/22, 4:09 PM:
--

Okay, found it.

In the code below, the method loops through all the available free elements 
and finds one it calls "bestFit".  It is going to store the new data into that 
available space.  But TIFF files have a rule that offsets have to be even 
(a multiple of 2).  So there's a check to see if the offset is odd and, if it 
is, the code advances the offset by one.  The problem is that it doesn't 
recognize that advancing the offset reduces the amount of available 
space (bestFit.length).  So, the "excessLength" computation below will be 
incorrect.  If some subsequent element is an exact match for the incorrect 
excessLength value, it will overrun the unused space and clobber whatever 
follows.   In this case, the thing that got clobbered was the first byte of 
EXIF tag 0x9010.

The probability of this happening is small, but non-zero.  It is just luck that 
Sicheng Yang's data sample triggered the issue.

 
{code:java}
                long offset = bestFit.offset;
                if ((offset & 1L) != 0) {
                    offset += 1;
                }
                outputItem.setOffset(offset);
                unusedElements.remove(bestFit);

                if (bestFit.length > outputItemLength) {
                    // not a perfect fit.
                    final long excessOffset = bestFit.offset + outputItemLength;
                    final int excessLength = bestFit.length - outputItemLength;
                    unusedElements.add(new TiffElement.Stub(excessOffset,
                            excessLength));
                    // make sure the new element is in the correct order.
                    unusedElements.sort(ELEMENT_SIZE_COMPARATOR);
                    Collections.reverse(unusedElements);
                }

{code}
   
I re-wrote the code as follows.  It works.  Writing a JUnit test for this is 
going to be extremely difficult unless we want to include the large test file 
in our code distribution (which I think would be a bad idea).
{code:java}
                long offset = bestFit.offset;
                int length = bestFit.length;
                if ((offset & 1L) != 0) {
                    // offsets have to be a multiple of 2; skipping a byte
                    // also uses up one byte of the free space
                    offset += 1;
                    length -= 1;
                }
                outputItem.setOffset(offset);
                unusedElements.remove(bestFit);

                if (length > outputItemLength) {
                    // not a perfect fit.
                    final long excessOffset = offset + outputItemLength;
                    final int excessLength = length - outputItemLength;
                    unusedElements.add(new TiffElement.Stub(excessOffset,
                            excessLength));
                    // make sure the new element is in the correct order.
                    unusedElements.sort(ELEMENT_SIZE_COMPARATOR);
                    Collections.reverse(unusedElements);
                }
{code}
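
To make the accounting concrete, here is a minimal, self-contained sketch of the same arithmetic. The class and method names (AlignedFitSketch, place) are made up for this illustration and are not part of Commons Imaging; the sketch only models an item being placed into a free block whose offset may be odd.

```java
/**
 * Hypothetical illustration only; not Commons Imaging code.
 * Models placing an item of itemLength bytes into a free block,
 * where the item must start on an even offset.
 */
final class AlignedFitSketch {

    /** Returns {itemOffset, excessOffset, excessLength}. */
    static long[] place(long freeOffset, long freeLength, long itemLength) {
        long offset = freeOffset;
        long length = freeLength;
        if ((offset & 1L) != 0) {
            // skipping the odd byte costs one byte of usable space
            offset += 1;
            length -= 1;
        }
        if (itemLength > length) {
            throw new IllegalArgumentException("item does not fit");
        }
        // whatever is left over starts right after the item
        final long excessOffset = offset + itemLength;
        final long excessLength = length - itemLength;
        return new long[] {offset, excessOffset, excessLength};
    }
}
```

For a free block at offset 101 with length 20 and a 10-byte item, the item lands at offset 102 and the leftover block is 9 bytes starting at 112, so it ends at 121, which is where the original free block ended. Without the length adjustment, the leftover would be recorded as 10 bytes starting at 111.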



[jira] [Created] (IMAGING-329) Opportunity to enhance speed of PNG Read

2022-03-18 Thread Gary Lucas (Jira)
Gary Lucas created IMAGING-329:
--

 Summary: Opportunity to enhance speed of PNG Read
 Key: IMAGING-329
 URL: https://issues.apache.org/jira/browse/IMAGING-329
 Project: Commons Imaging
  Issue Type: Improvement
  Components: Format: PNG
 Environment:  
Reporter: Gary Lucas


When reading an image file, the PNG reader makes calls to 
BufferedImage.setRGB() for each pixel to be set in its output image.   The 
setRGB method has a lot of overhead, and we could speed up processing by 
calling setRGB on an entire row of pixels rather than one-at-a-time.

The expediter loop also makes calls to its own getRGB method, which is generic 
across all the different PNG formats (32-bit true color, 24-bit true color, 
grayscale, indexed color).  This involves a lot of conditional evaluation and 
switch statements across formats.  If we were to implement specific loops for 
the most common formats (24- and 32-bit true color), we could streamline 
reading for those formats.

I experimented with both of these approaches using a 5000-by-5000 RGB image 
(24-bit true color without transparency or gamma correction).  The results were:

Current Version 1-Alpha 3:   0.917 seconds
Set entire row of pixels:    0.717 seconds
Custom loop for format:      0.609 seconds

I note that the saving is not spectacular (it would have been more important a 
decade ago when computers were slower), particularly since images of such a 
large size don't usually use PNG as a data format.
 
{code:java}
for (int y = 0; y < height; y++) {
    final byte[] unfiltered = getNextScanline(
            is, pixelBytesPerScanLine, prev, bytesPerPixel);
    prev = unfiltered;
    final BitParser bitParser = new BitParser(
            unfiltered, bitsPerPixel, bitDepth);

    for (int x = 0; x < width; x++) {
        final int rgb = getRGB(bitParser, x);
        bi.setRGB(x, y, rgb);
    }
}
{code}


And here's the modified inner loop for writing one row at a time:
{code:java}
final int[] argb = new int[width];
for (int x = 0; x < width; x++) {
    argb[x] = getRGB(bitParser, x);  // from ScanExpediterSimple
}
bi.setRGB(0, y, width, 1, argb, 0, width);
{code}
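
For anyone who wants to try the row-at-a-time idea outside the PNG reader, here is a small stand-alone sketch. RowWriteSketch and its two methods are hypothetical names used only for this example; the only library call involved is the standard BufferedImage.setRGB bulk overload (startX, startY, w, h, rgbArray, offset, scansize).

```java
import java.awt.image.BufferedImage;

/** Hypothetical demo class; not part of Commons Imaging. */
final class RowWriteSketch {

    /** Fills the image one pixel at a time. */
    static void fillPerPixel(BufferedImage bi, int[][] rows) {
        for (int y = 0; y < bi.getHeight(); y++) {
            for (int x = 0; x < bi.getWidth(); x++) {
                bi.setRGB(x, y, rows[y][x]);
            }
        }
    }

    /** Fills the image one row at a time using the bulk setRGB overload. */
    static void fillPerRow(BufferedImage bi, int[][] rows) {
        final int width = bi.getWidth();
        for (int y = 0; y < bi.getHeight(); y++) {
            // one call per row: a 1-pixel-high rectangle starting at (0, y)
            bi.setRGB(0, y, width, 1, rows[y], 0, width);
        }
    }
}
```

Both methods should produce identical images; the second simply makes one call per row instead of one per pixel. Any timing difference will depend on the image type and the JVM, so it is worth measuring rather than assuming.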
 

And, finally, here's a modified block that avoids the getRGB method and just 
processes bytes directly.   It has to check on a couple of pre-conditions to 
see if the data is in one of the frequently used formats, and then processes 
the bytes from the source file without any of the bit-access methods used by 
the ScanExpediterSimple class's getRGB method.
{code:java}
if (pngColorType == PngColorType.TRUE_COLOR
        && bitDepth == 8
        && transparencyFilter == null
        && gammaCorrection == null) {
    // 24-bit true color: each pixel is three consecutive bytes (R, G, B)
    final int[] argb = new int[width];
    for (int y = 0; y < height; y++) {
        final byte[] unfiltered = getNextScanline(
                is, pixelBytesPerScanLine, prev, bytesPerPixel);
        prev = unfiltered;  // the next scanline's filter needs this row
        int k = 0;
        for (int x = 0; x < width; x++) {
            final int r = unfiltered[k++] & 0xff;
            final int g = unfiltered[k++] & 0xff;
            final int b = unfiltered[k++] & 0xff;
            // opaque alpha goes in the top byte
            argb[x] = 0xff000000 | (r << 16) | (g << 8) | b;
        }
        bi.setRGB(0, y, width, 1, argb, 0, width);
    }
    return;
}
{code}
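
The byte packing is the part that is easiest to get subtly wrong, so here it is as an isolated sketch. RgbPackSketch and packRow are hypothetical names, and the sketch assumes 8 bits per sample with no alpha channel in the source data. For BufferedImage's default ARGB layout, an opaque pixel needs 0xff in the top byte.

```java
/** Hypothetical helper; not part of Commons Imaging. */
final class RgbPackSketch {

    /** Packs 8-bit R, G, B triples from a scanline into opaque ARGB ints. */
    static int[] packRow(byte[] scanline, int width) {
        final int[] argb = new int[width];
        int k = 0;
        for (int x = 0; x < width; x++) {
            // mask with 0xff because Java bytes are signed
            final int r = scanline[k++] & 0xff;
            final int g = scanline[k++] & 0xff;
            final int b = scanline[k++] & 0xff;
            argb[x] = 0xff000000 | (r << 16) | (g << 8) | b;
        }
        return argb;
    }
}
```

For example, the six bytes FF 00 00 12 34 56 pack into the two pixels 0xffff0000 (opaque red) and 0xff123456.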
 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (IMAGING-330) Implement PNG predictors to reduce output size

2022-03-28 Thread Gary Lucas (Jira)
Gary Lucas created IMAGING-330:
--

 Summary: Implement PNG predictors to reduce output size
 Key: IMAGING-330
 URL: https://issues.apache.org/jira/browse/IMAGING-330
 Project: Commons Imaging
  Issue Type: Improvement
  Components: Format: PNG
Reporter: Gary Lucas








[jira] [Updated] (IMAGING-330) Implement PNG predictors to reduce output size

2022-03-28 Thread Gary Lucas (Jira)


 [ 
https://issues.apache.org/jira/browse/IMAGING-330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated IMAGING-330:
---
Description: 
I propose to enhance the PngWriter class and PngImagingParameters class to 
allow the use of predictors. This change should reduce the size of output 
images written in PNG format. The resulting images will carry exactly the same 
data. There will be no loss of pixels or image quality. But the results will be 
smaller than those currently produced by either Imaging or Java's ImageIO class.

Background 

The PNG specification permits the use of optional predictors as part of its 
data compression logic. Predictors are applied through the use of a filter that 
transforms the data before it is passed to the conventional Deflate data 
compressor.  In some cases, predictors can result in a 30 percent reduction of 
file size. They are particularly suited to photographic images. Although they 
will work on graphics and line art, the reduction is often more modest.

You can find a description of predictors on 
[Wikipedia|https://en.wikipedia.org/wiki/Portable_Network_Graphics#Filtering]

The Java ImageIO class does not apply predictors as part of its processing.  
Consequently, if you write an image from a Java application using ImageIO, pull 
the image into Paint, and then save it under another name, the size of the 
image may actually decrease.  So when this feature is added to Commons Imaging, 
it will outperform ImageIO on output size when writing PNGs.
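
As a rough illustration of what a predictor does, here is a sketch of the simplest one, the PNG "Sub" filter (filter type 1), which replaces each byte with its difference from the corresponding byte of the pixel to its left. SubFilterSketch is a hypothetical name, and this is not the proposed PngWriter change, only the transform itself applied to a single scanline.

```java
/** Hypothetical illustration of the PNG "Sub" filter (filter type 1). */
final class SubFilterSketch {

    /** Replaces each byte with (byte - byte one pixel to the left) mod 256. */
    static byte[] filter(byte[] raw, int bytesPerPixel) {
        final byte[] out = new byte[raw.length];
        for (int i = 0; i < raw.length; i++) {
            final int left = i >= bytesPerPixel ? raw[i - bytesPerPixel] & 0xff : 0;
            out[i] = (byte) ((raw[i] & 0xff) - left);  // the byte cast wraps mod 256
        }
        return out;
    }

    /** Reverses filter(): adds back the already-reconstructed left neighbor. */
    static byte[] unfilter(byte[] filtered, int bytesPerPixel) {
        final byte[] out = new byte[filtered.length];
        for (int i = 0; i < filtered.length; i++) {
            final int left = i >= bytesPerPixel ? out[i - bytesPerPixel] & 0xff : 0;
            out[i] = (byte) ((filtered[i] & 0xff) + left);
        }
        return out;
    }
}
```

On a smooth gradient such as the RGB pixels (10,20,30), (12,22,32), (14,24,34), the filtered row is 10, 20, 30 followed by six 2s. Runs of small repeated values like that are what Deflate compresses well, and the transform is exactly reversible, so no image data is lost.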

> Implement PNG predictors to reduce output size
> --
>
> Key: IMAGING-330
> URL: https://issues.apache.org/jira/browse/IMAGING-330
> Project: Commons Imaging
>  Issue Type: Improvement
>  Components: Format: PNG
>Reporter: Gary Lucas
>Priority: Major
>


[jira] [Commented] (IMAGING-280) Length specifier for ByteSourceArray.

2021-02-23 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17289156#comment-17289156
 ] 

Gary Lucas commented on IMAGING-280:


This is an interesting idea.

Taking a very quick look through the Commons Imaging source code, I didn't find 
any obvious place where the alternate constructor could be applied. Do you have 
a specific area in mind?   Are you thinking of using this in applications 
outside the Commons Imaging package itself?

I agree that this would be fairly easy to implement...  I think it would 
probably take more effort to write the JUnit tests than the code itself.

> Length specifier for ByteSourceArray.
> -
>
> Key: IMAGING-280
> URL: https://issues.apache.org/jira/browse/IMAGING-280
> Project: Commons Imaging
>  Issue Type: Improvement
>Reporter: Garret Wilson
>Priority: Major
>
> Many of the library processing methods take a {{ByteSource}}. The 
> {{ByteSourceArray}} allows a byte source from an array of bytes, but 
> unfortunately it does not allow specification of the number of bytes, 
> assuming that the entire byte array is used; e.g.:
> {code:java}
> public ByteSourceArray(final byte[] bytes) {
> this(null, bytes);
> }
> {code}
> This severely impedes the use of the class if the code using 
> {{ByteSourceArray}} has a byte array partially filled. The obvious case is 
> processing data in a pipeline, when the producer has written to a 
> {{ByteArrayOutputStream}}. Although {{ByteArrayOutputStream.toByteArray()}} 
> provides a copy of the internal data, it is possible to subclass 
> {{ByteArrayOutputStream}} to get access to the underlying bytes to prevent 
> copying. Because {{ByteArrayOutputStream}} grows dynamically, the internal 
> byte array may not be full.
> Thus {{ByteSourceArray}} needs a separate constructor to indicate the length 
> (and even the offset), just like {{ByteArrayInputStream}} does:
> {code:java}
> public ByteArrayInputStream(byte buf[], int offset, int length) {…}
> {code}
> Moreover this is extremely trivial to add. Without it, however, the developer 
> is forced to basically reimplement the entire class.
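
The pipeline case described in the issue can be sketched with plain JDK classes. ExposedByteArrayOutputStream is a hypothetical name; buf and count are the protected fields of java.io.ByteArrayOutputStream, and ByteArrayInputStream(byte[], int, int) is the standard constructor quoted above. The sketch does not assume any offset/length constructor exists on ByteSourceArray.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

/** Hypothetical subclass that exposes the internal buffer without copying. */
final class ExposedByteArrayOutputStream extends ByteArrayOutputStream {

    /** The backing array; only the first validLength() bytes hold data. */
    byte[] internalBuffer() {
        return buf;
    }

    /** Number of bytes actually written so far. */
    int validLength() {
        return count;
    }

    /** Wraps the valid region of the buffer without copying it. */
    ByteArrayInputStream toInputStream() {
        return new ByteArrayInputStream(buf, 0, count);
    }
}
```

After writing three bytes, internalBuffer() is still the full backing array (32 bytes with the default constructor) while validLength() is 3. That gap is why a consumer that takes only a byte[] cannot tell where the data ends.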



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IMAGING-284) Tif file with transparent background result in black background png

2021-03-29 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17310787#comment-17310787
 ] 

Gary Lucas commented on IMAGING-284:


What version of Commons Imaging were you using?

I ask because the code currently in the GitHub repository should be able to 
handle TIFFs with transparency.  However, that code has not yet been released 
to the Maven Central Repository.  So if you were using the compiled Jars from 
the Maven Central Repository, they would not handle your image file properly.  
But if you downloaded code from GitHub and compiled your own Jars, they ought 
to work.  If they don't, then there could be a bug in the implementation.

I don't manage the Commons Imaging project, but I did contribute the code 
enhancements for the TIFF transparency feature. So if there is a problem, I 
would be anxious to find it.

How big is your test file?  If it's not too big, would it be possible to attach 
it to this Jira issue?  I do not have access to mediafire.com.

 

 

> Tif file with transparent background result in black background png
> ---
>
> Key: IMAGING-284
> URL: https://issues.apache.org/jira/browse/IMAGING-284
> Project: Commons Imaging
>  Issue Type: Bug
>  Components: Format: PNG
>Reporter: Gaurav Gupta
>Priority: Major
>
> When a tiff file (sample file [0]) with transparent background is converted 
> to png file using commons-imaging library, the background in png file is 
> black instead of transparent.
> [0] https://www.mediafire.com/file/tl0r9tsvx6a16zg/Test.tif/file





[jira] [Comment Edited] (IMAGING-284) Tif file with transparent background result in black background png

2021-03-29 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17310787#comment-17310787
 ] 

Gary Lucas edited comment on IMAGING-284 at 3/29/21, 4:57 PM:
--

What version of Commons Imaging were you using?

I ask because the code currently in the GitHub repository should be able to 
handle TIFFs with transparency.  However, that code has not yet been released 
to the Maven Central Repository.  So if you were using the compiled Jars from 
the Maven Central Repository, they would not handle your image file properly.  
But if you downloaded code from GitHub and compiled your own Jars, they ought 
to work.  If they don't, then there could be a bug in the implementation.

I don't manage the Commons Imaging project, but I did contribute the code 
enhancements for the TIFF transparency feature. So if there is a problem, I 
would be anxious to find it.

 

 




[jira] [Commented] (IMAGING-284) Tif file with transparent background result in black background png

2021-03-31 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312746#comment-17312746
 ] 

Gary Lucas commented on IMAGING-284:


I'm deeply suspicious of this mediafire website.  When I attempted to download 
your file, they tried to push some adware onto my computer and reconfigured my 
Chrome browser search.  I'm not going back.  So in the absence of a better way 
of accessing the material, I don't think there's much we can do.



[jira] [Comment Edited] (IMAGING-284) Tif file with transparent background result in black background png

2021-03-31 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312746#comment-17312746
 ] 

Gary Lucas edited comment on IMAGING-284 at 3/31/21, 10:38 PM:
---

I'm deeply suspicious of this mediafire website.  When I attempted to download 
your file, they tried to push some adware onto my computer and reconfigured my 
Chrome browser search.  I'm not going back.  So in the absence of a better way 
of accessing the material, I don't think there's much we can do.




[jira] [Comment Edited] (IMAGING-284) Tif file with transparent background result in black background png

2021-03-31 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312746#comment-17312746
 ] 

Gary Lucas edited comment on IMAGING-284 at 3/31/21, 10:38 PM:
---

I'm deeply suspicious of this mediafire website.  When I attempted to download 
your file, they tried to push some adware onto my computer and reconfigured my 
Chrome browser search.  I'm not going back.  So in the absence of a better way 
of accessing the material, I don't think there's much we can do.




[jira] [Comment Edited] (IMAGING-284) Tif file with transparent background result in black background png

2021-03-31 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312746#comment-17312746
 ] 

Gary Lucas edited comment on IMAGING-284 at 3/31/21, 11:03 PM:
---

I'm deeply suspicious of this mediafire website.  When I attempted to download 
your file, they tried to push some adware onto my computer and reconfigured my 
Chrome browser search.  On the other hand, the file did download, so I took a 
look.

I wrote a simple test program that read the TIFF file, obtained a buffered 
image, and then drew that image onto a "composite" image with a blue 
background.  Everything seemed to work fine.  

The code:
{code:java}
  public static void main(String[] args) throws IOException, ImageReadException {
    File input = new File(args[0]);
    Map<String, Object> params = new HashMap<>();
    BufferedImage bImage = Imaging.getBufferedImage(input, params);
    int w = bImage.getWidth();
    int h = bImage.getHeight();
    // Composite onto an opaque blue background so transparent pixels show as blue.
    BufferedImage composite = new BufferedImage(w, h,
        BufferedImage.TYPE_INT_RGB);
    Graphics2D g = composite.createGraphics();
    g.setColor(Color.blue);
    g.fillRect(0, 0, w + 1, h + 1);
    g.drawImage(bImage, 0, 0, w, h, null);
    g.dispose();

    File output = new File("Test.jpg");
    System.out.println("Writing image to " + output.getPath());
    ImageIO.write(composite, "JPG", output);
  }

{code}
 




[jira] [Issue Comment Deleted] (IMAGING-284) Tif file with transparent background result in black background png

2021-03-31 Thread Gary Lucas (Jira)


 [ 
https://issues.apache.org/jira/browse/IMAGING-284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated IMAGING-284:
---
Comment: was deleted



[jira] [Updated] (IMAGING-284) Tif file with transparent background result in black background png

2021-03-31 Thread Gary Lucas (Jira)


 [ 
https://issues.apache.org/jira/browse/IMAGING-284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated IMAGING-284:
---
Attachment: Test.jpg

> Tif file with transparent background result in black background png
> ---
>
> Key: IMAGING-284
> URL: https://issues.apache.org/jira/browse/IMAGING-284
> Project: Commons Imaging
>  Issue Type: Bug
>  Components: Format: PNG
>Reporter: Gaurav Gupta
>Priority: Major
> Attachments: Test.jpg
>
>
> When a tiff file (sample file [0]) with transparent background is converted 
> to png file using commons-imaging library, the background in png file is 
> black instead of transparent.
> [0] https://www.mediafire.com/file/tl0r9tsvx6a16zg/Test.tif/file





[jira] [Commented] (IMAGING-284) Tif file with transparent background result in black background png

2021-03-31 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312763#comment-17312763
 ] 

Gary Lucas commented on IMAGING-284:


I'm deeply suspicious of this mediafire website.  When I attempted to download 
your file, they tried to push some adware onto my computer and reconfigured my 
Chrome browser search.  On the other hand, the file did download, so I took a 
look.

I wrote a simple test program that read the TIFF file, obtained a buffered 
image, and then drew that image onto a "composite" image with a blue 
background.  Everything seemed to work fine.  
{code:java}
  public static void main(String[] args) throws IOException, ImageReadException {
    File input = new File(args[0]);
    Map<String, Object> params = new HashMap<>();
    BufferedImage bImage = Imaging.getBufferedImage(input, params);
    int w = bImage.getWidth();
    int h = bImage.getHeight();
    // Composite onto an opaque blue background so transparent pixels show as blue.
    BufferedImage composite = new BufferedImage(w, h,
        BufferedImage.TYPE_INT_RGB);
    Graphics2D g = composite.createGraphics();
    g.setColor(Color.blue);
    g.fillRect(0, 0, w + 1, h + 1);
    g.drawImage(bImage, 0, 0, w, h, null);
    g.dispose();

    File output = new File("Test.jpg");
    System.out.println("Writing image to " + output.getPath());
    ImageIO.write(composite, "JPG", output);
  }

{code}
!Test.jpg!



[jira] [Comment Edited] (IMAGING-284) Tif file with transparent background result in black background png

2021-03-31 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312763#comment-17312763
 ] 

Gary Lucas edited comment on IMAGING-284 at 3/31/21, 11:05 PM:
---

I'm deeply suspicious of this mediafire website.  When I attempted to download 
your file, they tried to push some adware onto my computer and reconfigured my 
Chrome browser search.  On the other hand, the file did download, so I took a 
look.

I wrote a simple test program that read the TIFF file, obtained a buffered 
image, and then drew that image onto a "composite" image with a blue 
background.  Everything seemed to work fine.  
{code:java}
  public static void main(String[] args) throws IOException, ImageReadException {
    File input = new File(args[0]);
    Map<String, Object> params = new HashMap<>();
    BufferedImage bImage = Imaging.getBufferedImage(input, params);
    int w = bImage.getWidth();
    int h = bImage.getHeight();
    // Composite onto an opaque blue background so transparent pixels show as blue.
    BufferedImage composite = new BufferedImage(w, h,
        BufferedImage.TYPE_INT_RGB);
    Graphics2D g = composite.createGraphics();
    g.setColor(Color.blue);
    g.fillRect(0, 0, w + 1, h + 1);
    g.drawImage(bImage, 0, 0, w, h, null);
    g.dispose();

    File output = new File("Test.jpg");
    System.out.println("Writing image to " + output.getPath());
    ImageIO.write(composite, "JPG", output);
  }

{code}
!Test.jpg!




[jira] [Created] (SANSELAN-54) Tiff (exif) tags of type double written in wrong byte order

2011-09-24 Thread Gary Lucas (JIRA)
Tiff (exif) tags of type double written in wrong byte order
---

 Key: SANSELAN-54
 URL: https://issues.apache.org/jira/browse/SANSELAN-54
 Project: Commons Sanselan
  Issue Type: Bug
 Environment: Tested under Windows XP.  Potentially all platforms.
Reporter: Gary Lucas


Reviewing the convertDoubleToByteArray and convertDoubleArrayToByteArray 
methods in BinaryFileFunctions.java, there are two blocks of code, one for 
BYTE_ORDER_MOTOROLA (big endian) and one for BYTE_ORDER_INTEL (little endian).  
These are backwards.  

The convertByteArrayToDouble method, on the other hand, appears to be correct.  
A reasonable test procedure would be to check whether these two sets of methods 
are mutually consistent.

The same problem appears to affect the "Float" variants, but not the 
Integer variants.
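
The consistency check suggested above can be sketched as follows.  This is a 
minimal, self-contained illustration and not the Sanselan code: the class and 
method names (DoubleByteOrderCheck, toBytes, fromBytes, isConsistent) are 
hypothetical, and java.nio.ByteBuffer serves only as an independent reference 
for the expected byte layout.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical stand-in for the conversion methods under discussion.
class DoubleByteOrderCheck {

    // Writes the 8 bytes of a double: most significant byte first for
    // big endian (Motorola), least significant byte first for little
    // endian (Intel).
    static byte[] toBytes(double value, ByteOrder order) {
        long bits = Double.doubleToRawLongBits(value);
        byte[] result = new byte[8];
        for (int i = 0; i < 8; i++) {
            int shift = (order == ByteOrder.BIG_ENDIAN) ? 8 * (7 - i) : 8 * i;
            result[i] = (byte) (bits >>> shift);
        }
        return result;
    }

    // Inverse of toBytes for the same byte order.
    static double fromBytes(byte[] bytes, ByteOrder order) {
        long bits = 0;
        for (int i = 0; i < 8; i++) {
            int shift = (order == ByteOrder.BIG_ENDIAN) ? 8 * (7 - i) : 8 * i;
            bits |= (bytes[i] & 0xffL) << shift;
        }
        return Double.longBitsToDouble(bits);
    }

    // True when the writer agrees with ByteBuffer and the reader
    // recovers the original bit pattern.
    static boolean isConsistent(double value, ByteOrder order) {
        byte[] written = toBytes(value, order);
        byte[] reference =
            ByteBuffer.allocate(8).order(order).putDouble(value).array();
        long roundTrip = Double.doubleToRawLongBits(fromBytes(written, order));
        return Arrays.equals(written, reference)
            && roundTrip == Double.doubleToRawLongBits(value);
    }

    public static void main(String[] args) {
        double[] samples = {0.0, 1.0, -2.5, Math.PI, Double.MAX_VALUE};
        for (double v : samples) {
            System.out.println(v
                + " big endian: " + isConsistent(v, ByteOrder.BIG_ENDIAN)
                + ", little endian: " + isConsistent(v, ByteOrder.LITTLE_ENDIAN));
        }
    }
}
```

A writer with the two byte-order branches swapped would fail the comparison 
against the reference bytes even if its own reader happened to be swapped in 
the same way, which is why the sketch checks both conditions.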

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (SANSELAN-58) Streamlined TIFF strip reader reduces load time by a factor of 5

2012-04-22 Thread Gary Lucas (JIRA)

[ 
https://issues.apache.org/jira/browse/SANSELAN-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13259230#comment-13259230
 ] 

Gary Lucas commented on SANSELAN-58:


I have reworked this code based on comments received on my earlier submissions 
and am uploading a patch titled Tracker_Item_58_22_Apr_2012.  Please disregard 
my earlier patch.

In particular, I have tried to keep the changes narrowly focused and easy to 
follow. This patch affects only the strip reader for TIFF files.   It is based 
on the latest trunk of the code as of today.

In testing with large TIFF files, this enhancement reduces the reading time of 
a TIFF file by more than 50 percent. 



 

> Streamlined TIFF strip reader reduces load time by a factor of 5
> 
>
> Key: SANSELAN-58
> URL: https://issues.apache.org/jira/browse/SANSELAN-58
> Project: Commons Sanselan
>  Issue Type: Improvement
>Reporter: Gary Lucas
> Attachments: Sanselan-58-TiffStripReaderSpeed.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>

[jira] [Updated] (SANSELAN-58) Streamlined TIFF strip reader reduces load time by a factor of 5

2012-04-22 Thread Gary Lucas (JIRA)

 [ 
https://issues.apache.org/jira/browse/SANSELAN-58?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated SANSELAN-58:
---

Attachment: Tracker_Item_58_22_Apr_2012.patch

> Streamlined TIFF strip reader reduces load time by a factor of 5
> 
>
> Key: SANSELAN-58
> URL: https://issues.apache.org/jira/browse/SANSELAN-58
> Project: Commons Sanselan
>  Issue Type: Improvement
>Reporter: Gary Lucas
> Attachments: Sanselan-58-TiffStripReaderSpeed.patch, 
> Tracker_Item_58_22_Apr_2012.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Testing reveals that streamlining the DataReaderStrip.java operations for 8 
> and 24 bit-per-pixel TIFF images reduces the TIFF file load time by a factor 
> of 5.  
> For each pixel in images of these types, the interpretStrip() method of 
> DataReaderStrip makes calls to a generic bit extractor using its 
> getSamplesAsBytes() method.  Internally, this method simply copies the 
> requisite number of bits (8 or 24), but it executes a lot of conditional 
> statements to do so.  Under most architectures, conditionals tend to take 2 
> to 3 times as long to execute as simple arithmetic statements, so this 
> approach is expensive (especially since an image may contain millions of 
> pixels).  While the implementation is very generic, the majority of TIFF 
> files out there appear to fall into two simple categories.  By implementing 
> specialized code for these two cases, the loading time for TIFF images is 
> dramatically reduced.
> The following snippet shows the code I used for testing.  It was added right 
> at the beginning of the interpretStrip() method.
>  // Oct 2011 changes.
> //  The general case decoder is based on the idea of using a 
> //  generic bit-reader to unpack the number of bytes that are
> //  needed.  Although it is efficiently implemented, it does
> //  require performing at least three conditional branches per sample
> //  extracted (and often more).   This change attempts to bypass that
> //  overhead by implementing specialized blocks of extraction code
> //  for commonly used 8 bitsPerPixel and 24 bitsPerPixel cases.
> //  In other cases, it will simply fall through to the original code.
> //  Note that when promoting a byte to an integer, it is necessary
> //  to mask it with 0xff because the Java byte type is signed
> //  and this implementation requires an unsigned value.
> if(x>=width)
> {
> // this may not be required.  it was coded based on the 
> // original implementation.  But looking at the TIFF 6.0 spec,
> // it looks like the rows always evenly fill out the strip,
> // so there should never be a partial row in a strip and x
> // should not be anything except zero.
> x = 0;
> y++;
> }
> if(y>=height)
> {
> // we check it once before starting, so that we don't have
> // to check it redundantly for each pixel
> return;
> }
> 
> if(predictor==-1 && this.bitsPerPixel==8)
> {
> int [] samples = new int[1];
> for(int i=0; i {
> samples[0] = bytes[i]&0x00ff;
> photometricInterpreter.interpretPixel(bi, samples, x, y);
> x++;
> if(x>=width)
> {
> x = 0;
> y++;
> if(y>=height)
> return; // any remaining bytes are not needed
> }
> } 
> return;
> }
> else if(predictor==-1 && this.bitsPerPixel==24)
> {
> int [] samples = new int[3];
> int k = 0;
> for(int i=0; i {
> samples[0] = bytes[k++]&0x00ff;
> samples[1] = bytes[k++]&0x00ff;
> samples[2] = bytes[k++]&0x00ff;
> photometricInterpreter.interpretPixel(bi, samples, x, y);
> x++;
> if(x>=width)
> {
> x = 0;
> y++;
> if(y>=height)
> return; // any remaining bytes are not needed
> }
> }   
> return;
> }
> 
> // original code before Oct 2011 modification
> ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
> BitInputStream bis = new BitInputStream(bais);
> etc.


[jira] [Commented] (SANSELAN-56) proposed enhancement reduces load time for some image files by 40 percent

2012-04-22 Thread Gary Lucas (JIRA)

[ 
https://issues.apache.org/jira/browse/SANSELAN-56?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13259236#comment-13259236
 ] 

Gary Lucas commented on SANSELAN-56:


Damjan,

I see that you've integrated the most important change from this proposal into 
the main trunk. 

There is still a significant improvement to be realized in terms of the 
photometric interpreter. I took your comments to heart about me overloading my 
last submission with too many changes. So I have implemented a new version of 
the interpreter that is derived from the current code trunk and focuses on only 
the relevant changes.  I would like to submit it.

Would it make sense to close out this tracker item and have me enter a new item 
that addresses just the change to the photometric interpreter?   I am ready to 
submit the new patch as soon as I hear from you.

Gary



> proposed enhancement reduces load time for some image files by 40 percent
> -
>
> Key: SANSELAN-56
> URL: https://issues.apache.org/jira/browse/SANSELAN-56
> Project: Commons Sanselan
>  Issue Type: Improvement
> Environment: Tested in Windows, Linux, MacOS
>Reporter: Gary Lucas
>  Labels: api-change
> Attachments: Sanselan-56-SpeedEnhanceTiff.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> I have identified an enhancement that reduces the time required to load a 
> TIFF image by 40 percent.  I have tested a modified version of Sanselan under 
> Windows, Linux, and MacOS with consistent savings on each platform.  
> Additionally, I suspect that this technique may be applicable to other areas 
> of the Sanselan code base, including more popular image formats supported by 
> Sanselan such as JPEG, PNG, etc.
> I propose to add the relevant code changes to the Sanselan code base.  Once 
> these modifications are in place, there would be an opportunity for others to 
> look at the pros and cons of applying the techniques to other data formats.
>
> The Enhancement
>
> To load an image from a TIFF file, Sanselan performs extensive data 
> processing in order to obtain RGB values for the pixels in the output image. 
> The code for that processing appears to be well written and efficient. Once 
> the RGB values are obtained, they are stored in a Java BufferedImage using a 
> call to the setRGB() method.
> Unfortunately, setRGB() is an extremely inefficient method.  A much 
> better approach is to store the data in an integer array and defer the 
> creation of the buffered image until all information for the image has been 
> collected.  Java has a nice (though somewhat obscure) API that lets memory 
> in an integer array be transferred directly to a BufferedImage so that the 
> system does not have to allocate additional memory for this procedure (a very 
> nice feature when dealing with huge images).  This change virtually 
> eliminated the overhead for transferring data to images, which accounted for 
> 40 percent of the time required to load images.  For TIFF files, this was a 
> reasonable approach because the TiffImageParser class always loads 4-byte 
> images and the getGrayscaleBufferedImage() method is never used.  I have not 
> investigated the code for the other image formats, but some refinement might 
> be needed for the one-byte grayscale images.
>
> Steps to Integration
>
> In sanselan.common, a new class called ImagePrep was created.  ImagePrep 
> carries a width, height, and an integer array for storing pixels.  It 
> provides its own setRGB() method which looks just like the one in 
> BufferedImage.  Finally, it provides a method called getBufferedImage() 
> which creates a BufferedImage from its internal integer array when the 
> processing is complete.
> In the TiffImageParser classes, data is read from the input stream and 
> transferred to pixel values in a series of classes known as 
> PhotometricInterpreters.  These were modified to operate on ImagePrep objects 
> rather than BufferedImage objects.  The DataReader and TiffImageParser 
> classes were modified to pass ImagePrep objects into the photometric 
> interpreters rather than using BufferedImages.
> At the very last step, before passing its result back to the calling method 
> (the Sanselan main class, etc.), the TiffImageParser used 
> ImagePrep.getBufferedImage() to convert the result to the expected form.
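
The deferred-image idea described in the quoted text can be sketched roughly 
as follows.  This is an illustration under stated assumptions, not the 
ImagePrep class from the patch: the class name ImagePrepSketch is hypothetical, 
and the use of DataBufferInt with Raster.createPackedRaster is one way to wrap 
an existing int array in a BufferedImage without copying it, which may or may 
not be the exact API the description has in mind.

```java
import java.awt.image.BufferedImage;
import java.awt.image.ColorModel;
import java.awt.image.DataBufferInt;
import java.awt.image.DirectColorModel;
import java.awt.image.Raster;
import java.awt.image.WritableRaster;

// Illustrative sketch only; names are hypothetical.
class ImagePrepSketch {
    private final int width;
    private final int height;
    private final int[] data; // one ARGB value per pixel, row-major

    ImagePrepSketch(int width, int height) {
        this.width = width;
        this.height = height;
        this.data = new int[width * height];
    }

    // Same argument order as BufferedImage.setRGB(x, y, rgb), but this
    // is a plain array store with no color-model conversion.
    void setRGB(int x, int y, int argb) {
        data[y * width + x] = argb;
    }

    // Wraps the existing array in a BufferedImage; the pixel data is
    // shared with the image rather than copied.
    BufferedImage getBufferedImage() {
        int[] masks = {0x00ff0000, 0x0000ff00, 0x000000ff, 0xff000000};
        ColorModel colorModel =
            new DirectColorModel(32, masks[0], masks[1], masks[2], masks[3]);
        DataBufferInt buffer = new DataBufferInt(data, data.length);
        WritableRaster raster = Raster.createPackedRaster(
            buffer, width, height, width, masks, null);
        return new BufferedImage(
            colorModel, raster, colorModel.isAlphaPremultiplied(), null);
    }
}
```

Because the array is shared, any later call to setRGB() on the sketch object 
would also change the returned image, so getBufferedImage() is meant to be 
called once, after all pixels have been stored.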

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (SANSELAN-58) Streamlined TIFF strip reader reduces load time by a factor of 5

2012-04-22 Thread Gary Lucas (JIRA)

[ 
https://issues.apache.org/jira/browse/SANSELAN-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13259306#comment-13259306
 ] 

Gary Lucas commented on SANSELAN-58:


Since the patch of 22 April 2012 claims enhanced speed of loading, I thought 
that some hard data might be in order.  One thing to keep in mind is that TIFF 
files are often huge.  The 25 megapixel file used for the test described below 
is not even considered especially large.   Thus any operation that is performed 
once-per-pixel is repeated 25 million times.  So if we can avoid even a single 
if/then operation, the savings add up.



The Enhancements 

The TIFF specification provides several formats for the storage of data. This 
patch optimizes the software for two special cases that represent the most 
widely used TIFF formats: 24-bit RGB and 8-bit grayscale or indexed palette. 

The 24-bit RGB enhancement involves 3 changes:
1) The old code implemented a generic getSamplesAsBytes method to extract data 
from the raw bit stream.  This method was invoked once per pixel.  Each time it 
was called, it needed to execute logic to see which format it was reading and 
which code branch to traverse.  By detecting the special-case 24-bit RGB format 
before entering the loop, this per-pixel overhead is avoided.

2) The access loop was reorganized as a nested row/column loop, eliminating one 
conditional operation per pixel.

3) The old code invoked a "photometricInterpreter" to pack the R, G, B values 
into a single integer for storage into the final product.  This call was 
replaced with byte-manipulation logic built into the loop (i.e. the logic for 
the byte manipulation was in-lined).  I tried several different ways of coding 
the byte-shifting to find one that was particularly fast.
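
The three changes listed above can be illustrated with a short sketch.  It is 
not the patched DataReaderStrip code: the class and method names are 
hypothetical, the input is assumed to be uncompressed, interleaved R,G,B bytes 
with no row padding, and the particular shift-and-or expression is just one 
plausible way to pack the channels.

```java
// Hypothetical helper; illustrates the technique only.
class Rgb24UnpackSketch {

    // Unpacks interleaved R,G,B bytes into opaque ARGB ints.  The format
    // is assumed to have been checked once by the caller, so the loop
    // body contains no per-pixel format tests.
    static int[] unpack(byte[] bytes, int width, int height) {
        int[] argb = new int[width * height];
        int k = 0; // index into the source bytes
        for (int row = 0; row < height; row++) {
            int offset = row * width;
            for (int col = 0; col < width; col++) {
                // Java bytes are signed, so mask with 0xff before shifting.
                int r = bytes[k++] & 0xff;
                int g = bytes[k++] & 0xff;
                int b = bytes[k++] & 0xff;
                argb[offset + col] = 0xff000000 | (r << 16) | (g << 8) | b;
            }
        }
        return argb;
    }
}
```

The nested row/column loop replaces the "x++; if (x >= width) ..." test that a 
flat loop needs on every pixel, and the packing expression takes the place of 
a method call per pixel.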


The Test Procedure
The testing was performed on a 5000-by-5000 pixel 24-bit RGB TIFF image.
The testing procedure loads the image 10 times and records how long each load 
operation takes.  When accumulating an average load time, it throws out the 
first two load operations.  So even though 10 tests are performed, the average 
load time is based on 8 tests.  Ignoring the first load operation makes sense 
because it is affected by things like class loading and JIT compiling and 
always takes longer than those that follow.  In other words, it is contaminated 
by timing factors other than those we wish to measure.  The second load 
operation is also ignored because, under Linux, I've observed that the second 
load sometimes shows evidence of timing contamination (though to a 
lesser degree than the first).  Between load operations, the test routine 
explicitly invokes the Java Runtime garbage collection method and then executes 
a 1-second sleep to give the garbage collection time to complete.  The purpose 
of the explicit garbage collection is to avoid contaminating the time 
measurements for a load test in which the garbage collector might be cleaning 
up memory from a previous load operation.
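
A timing loop along the lines described above might look like the following.  
This is a sketch, not the actual test program: the class name, the method 
name, and its parameters are hypothetical, and the task passed in is assumed 
to wrap whatever image-loading call is being measured.

```java
// Hypothetical timing harness; illustrates the procedure only.
class LoadTimerSketch {

    // Runs the task nTrials times and returns the mean elapsed time in
    // milliseconds, ignoring the first nDiscard trials (class loading,
    // JIT warm-up).  Returns 0.0 if no trial was counted.
    static double averageMillis(Runnable task, int nTrials, int nDiscard,
                                long pauseMillis) {
        double sumMillis = 0;
        int counted = 0;
        for (int i = 0; i < nTrials; i++) {
            long t0 = System.nanoTime();
            task.run();
            long t1 = System.nanoTime();
            if (i >= nDiscard) {
                sumMillis += (t1 - t0) / 1.0e6;
                counted++;
            }
            // Request a collection, then pause so that it is less likely
            // to overlap the next timed trial.
            System.gc();
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return counted == 0 ? 0.0 : sumMillis / counted;
    }
}
```

With the settings described above, the call would be 
averageMillis(task, 10, 2, 1000).  Note that System.gc() is only a request to 
the JVM, so the pause reduces, but does not eliminate, the chance of a 
collection running during a timed load.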

The Results:
Original Implementation:            2261.356 ms
Remove call to getSamplesAsBytes:    948.304 ms
Reorganize access loop:              879.196 ms
In-Line photometric interpreter:     624.951 ms

Total reduction:   72 percent

> Streamlined TIFF strip reader reduces load time by a factor of 5
> 
>
> Key: SANSELAN-58
> URL: https://issues.apache.org/jira/browse/SANSELAN-58
> Project: Commons Sanselan
>  Issue Type: Improvement
>Reporter: Gary Lucas
> Attachments: Sanselan-58-TiffStripReaderSpeed.patch, 
> Tracker_Item_58_22_Apr_2012.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>

[jira] [Created] (SANSELAN-75) Improve speed of TIFF Index-Color Palette

2012-04-23 Thread Gary Lucas (JIRA)
Gary Lucas created SANSELAN-75:
--

 Summary: Improve speed of TIFF Index-Color Palette 
 Key: SANSELAN-75
 URL: https://issues.apache.org/jira/browse/SANSELAN-75
 Project: Commons Sanselan
  Issue Type: Improvement
  Components: Format: TIFF
Reporter: Gary Lucas


TIFF supports an 8-bit-per-pixel, "indexed color" model in which the values in 
the source data give indices into a fixed palette of color settings. Unpacking 
data in this format requires two steps:  (1) reading and unpacking the palette 
in the source file, (2) performing a look-up operation for each pixel to map 
the index value to a color.  A similar approach was used in the original GIF 
specification and, checking Wikipedia, it appears that it is also supported by 
PNG and BMP.  TIFF files in this format are used extensively in mapping and 
Geographic Information System implementations.

The current implementation of the PhotometricInterpreterPalette for TIFF files 
has an inefficiency in that it performs the unpacking operation from step one 
each time a new pixel is read. Thus, if a file contains three million red 
pixels, the byte-manipulation for the color red is repeated three million 
times.  It would be more efficient for the PhotometricInterpreterPalette class 
to perform the unpacking operation once, in its constructor. Since there are a 
maximum of 256 colors in an 8-bit palette, this approach would require only a 
modest amount of memory.

In time tests this approach reduced the reading time for a large TIFF file by 
about 13 percent (for details on the timing procedures, see Tracker Item 58):

Unmodified Code:   1204.9 ms
After change:      1049.7 ms
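
The proposed change amounts to building a look-up table once.  The following 
is a minimal sketch, not the code from the patch: the class name and its 
methods are hypothetical.  It assumes the palette arrives in the layout of the 
TIFF ColorMap field (all red entries, then all green, then all blue, each a 
16-bit value) and that the high 8 bits of each entry are used.

```java
// Hypothetical palette interpreter; illustrates the technique only.
class PaletteLookupSketch {
    private final int[] argbByIndex;

    // Unpacks the palette once.  colorMap holds 3 * 2^bitsPerSample
    // entries: reds first, then greens, then blues, each in 0..65535.
    PaletteLookupSketch(int[] colorMap, int bitsPerSample) {
        int n = 1 << bitsPerSample;
        argbByIndex = new int[n];
        for (int i = 0; i < n; i++) {
            int r = (colorMap[i] >> 8) & 0xff;
            int g = (colorMap[i + n] >> 8) & 0xff;
            int b = (colorMap[i + 2 * n] >> 8) & 0xff;
            argbByIndex[i] = 0xff000000 | (r << 16) | (g << 8) | b;
        }
    }

    // Per-pixel work is now a single array access.
    int colorFor(int index) {
        return argbByIndex[index];
    }
}
```

For an 8-bit palette the table holds 256 ints (1 KB), which is the memory 
cost referred to above.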





[jira] [Updated] (SANSELAN-75) Improve speed of TIFF Index-Color Palette

2012-04-23 Thread Gary Lucas (JIRA)

 [ 
https://issues.apache.org/jira/browse/SANSELAN-75?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated SANSELAN-75:
---

Attachment: IndexColorPalette.tif
Tracker_Item_75_23_Apr_2012.patch

The attached patch addresses the issue.  Since there are no TIFF format files 
of the relevant type in the Sanselan test suite, I am also attaching one.

> Improve speed of TIFF Index-Color Palette 
> --
>
> Key: SANSELAN-75
> URL: https://issues.apache.org/jira/browse/SANSELAN-75
> Project: Commons Sanselan
>  Issue Type: Improvement
>  Components: Format: TIFF
>Reporter: Gary Lucas
> Attachments: IndexColorPalette.tif, Tracker_Item_75_23_Apr_2012.patch
>
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>





[jira] [Commented] (SANSELAN-56) proposed enhancement reduces load time for some image files by 40 percent

2012-04-23 Thread Gary Lucas (JIRA)

[ 
https://issues.apache.org/jira/browse/SANSELAN-56?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13259844#comment-13259844
 ] 

Gary Lucas commented on SANSELAN-56:


Damjan,

I entered a new Tracker Item (#75) for the change to the photometric 
interpreter.  That completes the last of the changes I originally posted for 
this item.  So I think you can close or reject this item as you see fit.

Gary



> proposed enhancement reduces load time for some image files by 40 percent
> -
>
> Key: SANSELAN-56
> URL: https://issues.apache.org/jira/browse/SANSELAN-56
> Project: Commons Sanselan
>  Issue Type: Improvement
> Environment: Tested in Windows, Linux, MacOS
>Reporter: Gary Lucas
>  Labels: api-change
> Attachments: Sanselan-56-SpeedEnhanceTiff.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>





[jira] [Created] (SANSELAN-76) Reduce memory use of TIFF readers

2012-04-24 Thread Gary Lucas (JIRA)
Gary Lucas created SANSELAN-76:
--

 Summary: Reduce memory use of TIFF readers
 Key: SANSELAN-76
 URL: https://issues.apache.org/jira/browse/SANSELAN-76
 Project: Commons Sanselan
  Issue Type: Improvement
  Components: Format: TIFF
Reporter: Gary Lucas


This Tracker Item proposes changes to the TIFF file readers to address memory 
issues when reading very large images from TIFF files.  The TIFF format is used 
extensively in technical applications such as aerial photographs, satellite 
images, and digital raster maps which feature very large image sizes.  For 
example, the public-domain Natural Earth Data set features raster files sized 
21,600 by 10,800 pixels (222.5 megapixels).   Although this example is 
unusually large, image sizes of 25 to 100 megapixels are common for such 
applications.

Unfortunately, when Sanselan reads a TIFF image, it consumes nearly twice as 
much memory as is necessary.  The reader operates in two stages. First, it 
reads the entire source file into memory, then it builds the output image, also 
in memory.   In the example file mentioned above, the source data runs from 
83.19 to 373 megabytes (depending on compression).   Thus Sanselan would 
require a minimum of 83.19+4*222.5 = 973 megabytes to produce an image for one 
of these files (allowing 4 bytes per pixel in the output BufferedImage).

Fortunately, TIFF files are organized so that they can be read a piece at a 
time.  TIFF files are divided into either strips or tiles and, if data 
compression is used, each piece is compressed individually.  Thus each 
individual piece has no dependency on the others. 

This item proposes to implement two changes:

1)  Allow the TIFF data reader to read the files one piece at a time while 
constructing the buffered image.  Thus the memory use for reading would be no 
larger than the piece size.  This would be an internal change, so the external 
appearance of the Sanselan getBufferedImage methods would not change.

2) Provide new API elements that permit applications to read the strips or 
tiles from TIFF files individually. This change would support applications 
that need to access very large TIFF files without committing the memory to 
store a BufferedImage for the entire file (a 222.5 megapixel image requires 890 
megabytes, which is a lot even by contemporary standards).
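
The piece-at-a-time reading proposed in the two changes above can be sketched 
as follows for the strip case with a random-access file.  This is only an 
illustration: the class, the StripHandler interface, and the method signature 
are hypothetical, and the offsets and byte counts are assumed to have been 
read already from the StripOffsets and StripByteCounts fields of the TIFF 
directory.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical strip-wise reader; illustrates the technique only.
class StripwiseReaderSketch {

    // Hypothetical callback that decodes one strip into the output image.
    interface StripHandler {
        void handle(int stripIndex, byte[] buffer, int length)
            throws IOException;
    }

    // Reads each strip in turn, reusing a single buffer sized to the
    // largest strip, so the memory held for source data never exceeds
    // one strip.
    static void readStrips(File file, long[] offsets, int[] byteCounts,
                           StripHandler handler) throws IOException {
        int maxCount = 0;
        for (int count : byteCounts) {
            maxCount = Math.max(maxCount, count);
        }
        byte[] buffer = new byte[maxCount];
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            for (int i = 0; i < offsets.length; i++) {
                raf.seek(offsets[i]);
                raf.readFully(buffer, 0, byteCounts[i]);
                handler.handle(i, buffer, byteCounts[i]);
            }
        }
    }
}
```

An application-facing API for reading individual strips or tiles could expose 
the same loop body as a method that returns a single piece on request.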

There is one minor issue in this implementation that is easily addressed.  
Sanselan reads images from ByteSources that can be either random-access files 
or sequential-access input streams.  In the case of sequential-access input 
streams, it may be hard to perform a partial read on a TIFF directory.  In such 
a case, the TIFF access routines might have to resort to reading the entire 
source data into memory, as they currently do.  This would simply be a 
limitation of the implementation.

There is one issue that may make this change a bit problematic.  The TIFF 
processors depend on accessing a class called TiffDataElement that contains a 
public array of bytes called "data".   The most expeditious way of implementing 
the enhancement is to make this element private and add an accessor that 
either returns the data from internal memory or else loads it on demand.  
Unfortunately, because the data element is scoped as public, there is a chance 
that some existing applications are using it directly.   In hindsight, it is 
clear that scoping this element as public was a mistake, but it may be too late 
to fix it.  So care will be required to ensure that compatibility is 
maintained.  The most likely solution seems to be to implement a new class for 
passing raw data from the source TIFF files to the DataReader implementations.
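
The accessor approach mentioned above (private data, loaded on demand) might 
look roughly like this.  It is a sketch with hypothetical names and is not a 
proposal for the actual TiffDataElement API: it simply shows a private byte 
array that is filled from a random-access source the first time it is 
requested.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical class; illustrates lazy loading behind an accessor.
class LazyDataElementSketch {
    private final long offset;
    private final int length;
    private byte[] data; // null until first requested

    LazyDataElementSketch(long offset, int length) {
        this.offset = offset;
        this.length = length;
    }

    // Returns the bytes, reading them from the file on first use and
    // returning the cached array afterwards.
    byte[] getData(RandomAccessFile source) throws IOException {
        if (data == null) {
            byte[] loaded = new byte[length];
            source.seek(offset);
            source.readFully(loaded);
            data = loaded;
        }
        return data;
    }

    boolean isLoaded() {
        return data != null;
    }

    // Lets a caller drop the cached bytes once the piece is decoded.
    void release() {
        data = null;
    }
}
```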





[jira] [Commented] (SANSELAN-76) Reduce memory use of TIFF readers

2012-04-28 Thread Gary Lucas (JIRA)

[ 
https://issues.apache.org/jira/browse/SANSELAN-76?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264443#comment-13264443
 ] 

Gary Lucas commented on SANSELAN-76:



I've basically completed the first of the two changes proposed above.  This 
change reduces the amount of memory needed to load a TIFF image.   I think it 
would probably be best if I treated the second proposed change as a different 
tracker item (so as to keep the scope of this change small and manageable).

The following shows a comparison of the Before and After versions.  Both the 
amount of memory used for live objects and for the JVM as a whole are 
significantly reduced.

This is tested using a 1-by-1 TIFF file from the U.S. Geological 
Survey.  The source file uses the 24-bit RGB format and is 286.2 MB.  The 
output image is 381 MB in size.  Memory stats are extracted from the Java 
Runtime class and collected before the TiffParser goes out-of-scope.


Before change
    time to load image           memory
  time ms     avg ms      used mb    total mb
 2391.575      0.000      670.951    1015.375
 1797.042      0.000      675.350    1160.000
 1703.935   1703.935      670.298    1045.633
 1924.843   1814.389      671.955    1015.188
 1708.914   1779.231      672.305    1160.000
 1687.799   1756.373      670.298    1045.633
 1927.832   1790.665      672.176    1015.188
 1794.254   1791.263      670.789    1160.000
 1698.290   1777.981      670.298    1045.633
 1928.838   1796.838      672.220    1015.188


After
    time to load image           memory
  time ms     avg ms      used mb    total mb
 2128.425      0.000      382.990     397.035
 1823.000      0.000      528.913     568.723
 1845.152   1845.152      413.471     695.723
 1904.049   1874.601      383.010     397.039
 1904.234   1884.478      383.210     397.039
 1907.394   1890.207      383.210     397.039
 1905.385   1893.243      383.197     397.039
 1907.052   1895.544      383.197     397.039
 1902.848   1896.588      383.197     397.039
 1898.601   1896.840      383.197     397.039


> Reduce memory use of TIFF readers
> -
>
> Key: SANSELAN-76
> URL: https://issues.apache.org/jira/browse/SANSELAN-76
> Project: Commons Sanselan
>  Issue Type: Improvement
>  Components: Format: TIFF
>Reporter: Gary Lucas
>   Original Estimate: 80h
>  Remaining Estimate: 80h
>
> This Tracker Item proposes changes to the TIFF file readers to address memory 
> issues when reading very large images from TIFF files.  The TIFF format is 
> used extensively in technical applications such as aerial photographs, 
> satellite images, and digital raster maps which feature very large image 
> sizes.  For example, the public-domain Natural Earth Data set features raster 
> files sized 21,600 by 10,800 pixels (222.5 megapixels).   Although this 
> example is unusually large, image sizes of 25 to 100 megapixels are common 
> for such applications.
> Unfortunately, when Sanselan reads a TIFF image, it consumes nearly twice as 
> much memory as is necessary.  The reader operates in two stages. First, it 
> reads the entire source file into memory then it builds the output image, 
> also in memory.   In the example file mentioned above, the source data runs 
> from 83.19 to 373 megabytes (depending on compression).   Thus Sanselan would 
> require a minimum of 83.19+4*222.5 = 973 megabytes to produce an image for 
> one of these files (allowing 4 bytes per pixel in the output BufferedImage).
> Fortunately, TIFF files are organized so that they can be read a piece at a 
> time.  TIFF files are divided into either strips or tiles and, if data 
> compression is used, each piece is compressed individually.  Thus each 
> individual piece has no dependency on the other. 
> This item proposes to implement two changes:
> 1)  Allow the TIFF data reader to read the files one piece at a time while 
> constructing the buffered image.  Thus the memory use for reading would be no 
> larger than the piece size.  This would be an internal change, so the 
> external appearance of the Sanselan getBufferedImage methods would not change.
> 2) Provide new API elements that permit applications to read the strips or 
> tiles from TIFF files individually. This change would support 
> applications that needed to access very large TIFF files without committing 
> the memory to store a BufferedImage for the entire file (a 222.5 megapixel 
> image requires 890 megabytes, which is a lot even by contemporary standards).
> There is one minor issue in this implementation that is easily addressed.  
> Sanselan reads images from ByteSources that can be either random-access files 
> or sequential-access input streams.  In the case of sequential-input streams, 
> it may be 

[jira] [Issue Comment Edited] (SANSELAN-76) Reduce memory use of TIFF readers

2012-04-28 Thread Gary Lucas (JIRA)

[ 
https://issues.apache.org/jira/browse/SANSELAN-76?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264443#comment-13264443
 ] 

Gary Lucas edited comment on SANSELAN-76 at 4/29/12 12:40 AM:
--

I've basically completed the first of the two changes proposed above.  This 
change reduces the amount of memory needed to load a TIFF image.   I think it 
would probably be best if I treated the second proposed change as a different 
tracker item (so as to keep the scope of this change small and manageable).

The following shows a comparison of the Before and After versions.  Both the 
amount of memory used for live objects and for the JVM as a whole are 
significantly reduced.

 
This is tested using a 1-by-1 TIFF file from the U.S. Geological 
Survey.  The source file uses the 24-bit RGB format and is 286.2 MB.  The 
output image is 381 MB in size.  Memory stats are extracted from the Java 
Runtime class and collected before the TiffParser goes out-of-scope.


Before change
time to load image --memory
 time ms  avg ms   -- used mb   total mb
 2391.575 0.000--670.951  1015.375 
 1797.042 0.000--675.350  1160.000 
 1703.935  1703.935--670.298  1045.633 
 1924.843  1814.389--671.955  1015.188 
 1708.914  1779.231--672.305  1160.000 
 1687.799  1756.373--670.298  1045.633 
 1927.832  1790.665--672.176  1015.188 
 1794.254  1791.263--670.789  1160.000 
 1698.290  1777.981--670.298  1045.633 
 1928.838  1796.838--672.220  1015.188


After
 time to load image-- memory
 time ms  avg ms   --used mb   total mb
 2128.425 0.000--382.990   397.035 
 1823.000 0.000--528.913   568.723 
 1845.152  1845.152--413.471   695.723 
 1904.049  1874.601--383.010   397.039 
 1904.234  1884.478--383.210   397.039 
 1907.394  1890.207--383.210   397.039 
 1905.385  1893.243--383.197   397.039 
 1907.052  1895.544--383.197   397.039 
 1902.848  1896.588--383.197   397.039 
 1898.601  1896.840--383.197   397.039



[jira] [Issue Comment Edited] (SANSELAN-76) Reduce memory use of TIFF readers

2012-04-28 Thread Gary Lucas (JIRA)

[ 
https://issues.apache.org/jira/browse/SANSELAN-76?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264443#comment-13264443
 ] 

Gary Lucas edited comment on SANSELAN-76 at 4/29/12 12:43 AM:
--

I've basically completed the first of the two changes proposed above.  This 
change reduces the amount of memory needed to load a TIFF image.   I think it 
would probably be best if I treated the second proposed change as a different 
tracker item (so as to keep the scope of this change small and manageable).

The following shows a comparison of the Before and After versions.  Both the 
amount of memory used for live objects and for the JVM as a whole are 
significantly reduced.

 
This is tested using a 1-by-1 TIFF file from the U.S. Geological 
Survey.  The source file uses the 24-bit RGB format and is 286.2 MB.  The 
output image is 381 MB in size.  Memory stats are extracted from the Java 
Runtime class and collected before the TiffParser goes out-of-scope.

{code}

Before change
time to load image --memory
 time ms  avg ms   -- used mb   total mb
 2391.575 0.000--670.951  1015.375 
 1797.042 0.000--675.350  1160.000 
 1703.935  1703.935--670.298  1045.633 
 1924.843  1814.389--671.955  1015.188 
 1708.914  1779.231--672.305  1160.000 
 1687.799  1756.373--670.298  1045.633 
 1927.832  1790.665--672.176  1015.188 
 1794.254  1791.263--670.789  1160.000 
 1698.290  1777.981--670.298  1045.633 
 1928.838  1796.838--672.220  1015.188


After
 time to load image-- memory
 time ms  avg ms   --used mb   total mb
 2128.425 0.000--382.990   397.035 
 1823.000 0.000--528.913   568.723 
 1845.152  1845.152--413.471   695.723 
 1904.049  1874.601--383.010   397.039 
 1904.234  1884.478--383.210   397.039 
 1907.394  1890.207--383.210   397.039 
 1905.385  1893.243--383.197   397.039 
 1907.052  1895.544--383.197   397.039 
 1902.848  1896.588--383.197   397.039 
 1898.601  1896.840--383.197   397.039

{code}


[jira] [Created] (SANSELAN-78) Improve speed of random-access-file handling for TIFF format, potentially others

2012-04-30 Thread Gary Lucas (JIRA)
Gary Lucas created SANSELAN-78:
--

 Summary: Improve speed of random-access-file handling for TIFF 
format, potentially others
 Key: SANSELAN-78
 URL: https://issues.apache.org/jira/browse/SANSELAN-78
 Project: Commons Sanselan
  Issue Type: Improvement
  Components: Format: TIFF
Reporter: Gary Lucas



Large TIFF files can be organized into chunks (either strips or tiles) so that 
the image can be read a piece at a time.  In the Apache Imaging implementation, 
each time one of these pieces is read, the TiffReader uses the getBlock() 
method of the ByteSourceFile class.  This class opens the file using the Java 
RandomAccessFile class, seeks to the position of the data in the file, reads 
its content, and closes the file.   Although this operation can be performed 
several times and thus entails a lot of redundant file opens and reads, the 
file cache performance on modern computers is truly amazing and for files of 
less than 5 megabytes, it often doesn't make a difference.   On larger files, 
however, it can be significant.

This Tracker Item proposes to modify the ByteSourceFile class so that an access 
routine can optionally hold the file open between getBlock() method calls.   It 
will accomplish this by adding a new method called .setPersistent(boolean).  By 
default, persistence will be set to false and the ByteSourceFile class will 
continue to work just as it always has (existing code will not be affected).  
If persistence is set to true, the RandomAccessFile will be held open.
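
The proposed behavior could be sketched roughly as follows.  This is an
illustration only: PersistentByteSource is a hypothetical class, not the
existing ByteSourceFile, and the getBlock(long, int) signature and the
isFileOpen() helper are assumptions made for the example.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.RandomAccessFile;

/**
 * Hypothetical file-backed byte source; not the actual ByteSourceFile.
 * With persistence off, every getBlock() opens and closes the file.
 * With persistence on, one RandomAccessFile is reused between calls.
 */
final class PersistentByteSource implements Closeable {
    private final String path;
    private boolean persistent;      // false by default, as proposed
    private RandomAccessFile held;   // non-null only while held open

    PersistentByteSource(String path) {
        this.path = path;
    }

    /** Turning persistence off releases any file handle being held. */
    void setPersistent(boolean persistent) throws IOException {
        this.persistent = persistent;
        if (!persistent) {
            close();
        }
    }

    /** Reads length bytes starting at the given file position. */
    byte[] getBlock(long start, int length) throws IOException {
        byte[] block = new byte[length];
        if (persistent) {
            if (held == null) {
                held = new RandomAccessFile(path, "r");
            }
            held.seek(start);
            held.readFully(block);
        } else {
            try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
                raf.seek(start);
                raf.readFully(block);
            }
        }
        return block;
    }

    /** True while a file handle is being held between calls. */
    boolean isFileOpen() {
        return held != null;
    }

    @Override
    public void close() throws IOException {
        if (held != null) {
            held.close();
            held = null;
        }
    }
}
```

In this sketch, setting persistence back to false closes the file at once, so
a reader that enables persistence can release the handle as soon as it is done
rather than waiting for garbage collection.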

To get some sense of the performance difference, I ran several tests.  For the 
sample "ron and andy.tif" file provided with the Apache Imaging package, which 
is under 5 megabytes, the change made little difference.   However, when I 
tested with larger files, such as the Apache Imaging sample 2560-by-1920 
pixel PICT2833.TIF file (a blurry picture of a pretty girl), and a 
2500-by-2500 pixel file I downloaded from the US Geological Survey (USGS), I 
saw notable differences.  

I also tested on a fast local disk (my PC) and on a network disk.  Not 
surprisingly, the network disk showed the biggest change (in order to keep the 
test environment clean, I ran the network test early in the morning when the 
network was lightly used).

As the tests below show, on the local disk the savings are modest even for the 
largest file.  However, when dealing with a network file system, the 
change becomes significant.

{code}
ron and andy.tif   1500-by-1125   4.8 MB
  local   original:       25.9 ms.
  local   modified:       24.8 ms.
  network original:      122.7 ms.
  network modified:      117.6 ms.

PICT2833.TIF   2560-by-1920  14.1 MB
  local   original:       77.7 ms.
  local   modified:       61.7 ms.
  network original:      774.1 ms.
  network modified:      463.8 ms.

USGS1   2500-by-2500   18.8 MB
  local   original:      192.3 ms.
  local   modified:       94.5 ms.
  network original:     3992.8 ms.
  network modified:     1807.1 ms.

USGS2  1-by-1  286 MB
  local   original:     1930.5 ms.
  local   modified:     1344.5 ms.
  network original:    26627.6 ms.
  network modified:    13402.1 ms.

{code}
One consequence of this change is that if persistence is set to true, the file 
will be held open until the ByteSourceFile goes out-of-scope and is garbage 
collected.  So this change will also make sure that the TiffReader sets the 
persistence back to false when it is done reading the file in order to expedite 
the release of file resources.






[jira] [Updated] (SANSELAN-76) Reduce memory use of TIFF readers

2012-05-05 Thread Gary Lucas (JIRA)

 [ 
https://issues.apache.org/jira/browse/SANSELAN-76?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated SANSELAN-76:
---

Attachment: Tracker_76_Test_5_May_2012.patch

Patch showing changes

> Reduce memory use of TIFF readers
> -
>
> Key: SANSELAN-76
> URL: https://issues.apache.org/jira/browse/SANSELAN-76
> Project: Commons Sanselan
>  Issue Type: Improvement
>  Components: Format: TIFF
>Reporter: Gary Lucas
> Attachments: Tracker_76_Test_5_May_2012.patch
>
>   Original Estimate: 80h
>  Remaining Estimate: 80h
>
> This Tracker Item proposes changes to the TIFF file readers to address memory 
> issues when reading very large images from TIFF files.  The TIFF format is 
> used extensively in technical applications such as aerial photographs, 
> satellite images, and digital raster maps which feature very large image 
> sizes.  For example, the public-domain Natural Earth Data set features raster 
> files sized 21,600 by 10,800 pixels (222.5 megapixels).   Although this 
> example is unusually large, image sizes of 25 to 100 megapixels are common 
> for such applications.
> Unfortunately, when Sanselan reads a TIFF image, it consumes nearly twice as 
> much memory as is necessary.  The reader operates in two stages. First, it 
> reads the entire source file into memory then it builds the output image, 
> also in memory.   In the example file mentioned above, the source data runs 
> from 83.19 to 373 megabytes (depending on compression).   Thus Sanselan would 
> require a minimum of 83.19+4*222.5 = 973 megabytes to produce an image for 
> one of these files (allowing 4 bytes per pixel in the output BufferedImage).
> Fortunately, TIFF files are organized so that they can be read a piece at a 
> time.  TIFF files are divided into either strips or tiles and, if data 
> compression is used, each piece is compressed individually.  Thus each 
> individual piece has no dependency on the other. 
> This item proposes to implement two changes:
> 1)  Allow the TIFF data reader to read the files one piece at a time while 
> constructing the buffered image.  Thus the memory use for reading would be no 
> larger than the piece size.  This would be an internal change, so the 
> external appearance of the Sanselan getBufferedImage methods would not change.
> 2) Provide new API elements that permit applications to read the strips or 
> tiles from TIFF files individually. This change would support 
> applications that needed to access very large TIFF files without committing 
> the memory to store a BufferedImage for the entire file (a 222.5 megapixel 
> image requires 890 megabytes, which is a lot even by contemporary standards).
> There is one minor issue in this implementation that is easily addressed.  
> Sanselan reads images from ByteSources that can be either random-access files 
> or sequential-access input streams.  In the case of sequential-input streams, 
> it may be hard to perform a partial read on a TIFF directory.  In such a 
> case, the TIFF access routines might have to resort to reading the entire 
> source data into memory as it currently does.   This would simply be a 
> limitation of the implementation.
> There is one issue that may make this change a bit problematic.  The TIFF 
> processors depend on accessing a class called TiffDataElement that contains a 
> public array of bytes called "data".   The most expeditious way of 
> implementing the enhancement is to make this element private and add an 
> accessor that either returns the data from internal memory or else loads it 
> on-demand.  Unfortunately, because the data element is scoped to public, 
> there is a chance that some existing applications are using it directly.   In 
> hindsight, it is clear that scoping this element as public was a mistake, but 
> it may be too late to fix it.  So care will be required to ensure that 
> compatibility remains.   The most likely solution seems to be to implement a 
> new class for passing raw data from the source TIFF files to the DataReader 
> implementations.





[jira] [Issue Comment Edited] (SANSELAN-76) Reduce memory use of TIFF readers

2012-05-05 Thread Gary Lucas (JIRA)

[ 
https://issues.apache.org/jira/browse/SANSELAN-76?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269102#comment-13269102
 ] 

Gary Lucas edited comment on SANSELAN-76 at 5/6/12 1:59 AM:


I prepared a patch showing the changes that produced the reduction in memory 
use I included in my earlier comments.   The changes involve two classes, 
DataReaderStrips.java and DataReaderTiles.java, that I had previously modified 
for the still pending patches I submitted for tracker item 58.   In order to 
keep the work separate, I backed out the changes from item 58 and made sure I 
worked on pristine versions of the classes from the Apache Imaging development 
trunk.  

The downside to doing that is that the two tracker items now represent 
parallel versions of the code.  I am highly motivated to get these changes into 
the code base because they permit me to access large TIFF files that were 
previously unreadable for my application due to memory use.  So let me know if 
you need me to prepare additional patches for submission.

  

  was (Author: gwlucas):
Patch showing changes
  

[jira] [Created] (SANSELAN-79) Include a test utility for timing and memory in project example classes

2012-05-05 Thread Gary Lucas (JIRA)
Gary Lucas created SANSELAN-79:
--

 Summary: Include a test utility for timing and memory in project 
example classes
 Key: SANSELAN-79
 URL: https://issues.apache.org/jira/browse/SANSELAN-79
 Project: Commons Sanselan
  Issue Type: New Feature
Reporter: Gary Lucas
Priority: Minor


For the convenience of developers, I've written a little test class for 
measuring the time and memory required to load a file using Apache Imaging.  
The code can be modified for other purposes (such as writing files).  
I propose that this class be added to the examples in the Apache Imaging code 
distribution.

The Java file I've included with this upload includes a lot of explanation of 
why I do certain things when I'm testing.  I don't claim that it's the last 
word in testing procedures.  Really, it's more of a first word that may lead to 
further discussion and even more useful tools.
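
As a rough illustration of this kind of measurement (not the attached class
itself), the sketch below times a task and samples JVM memory through the Java
Runtime class.  The names LoadTimer, Sample, measure and averageAfterWarmup
are hypothetical.  The warm-up handling is an assumption based on the avg ms
column in the SANSELAN-76 tables, which is consistent with a running mean that
skips the first two loads.

```java
import java.util.concurrent.Callable;

/** Hypothetical timing and memory helper; not the attached test class. */
final class LoadTimer {
    private static final double MB = 1024.0 * 1024.0;

    /** One measurement: elapsed time plus JVM memory figures. */
    static final class Sample {
        final Object result;   // retained so the loaded object stays reachable
        final double timeMs;
        final double usedMb;
        final double totalMb;

        Sample(Object result, double timeMs, double usedMb, double totalMb) {
            this.result = result;
            this.timeMs = timeMs;
            this.usedMb = usedMb;
            this.totalMb = totalMb;
        }
    }

    /** Runs the task once and records elapsed time and memory in use. */
    static Sample measure(Callable<?> task) throws Exception {
        Runtime rt = Runtime.getRuntime();
        long start = System.nanoTime();
        Object result = task.call();
        long elapsed = System.nanoTime() - start;
        long total = rt.totalMemory();        // memory currently claimed by the JVM
        long used = total - rt.freeMemory();  // portion of that memory in use
        return new Sample(result, elapsed / 1.0e6, used / MB, total / MB);
    }

    /** Mean of the times after skipping the first warmup entries. */
    static double averageAfterWarmup(double[] timesMs, int warmup) {
        if (timesMs.length <= warmup) {
            return 0.0;
        }
        double sum = 0.0;
        for (int i = warmup; i < timesMs.length; i++) {
            sum += timesMs[i];
        }
        return sum / (timesMs.length - warmup);
    }
}
```

A caller would pass the image-loading call as the task, for example a lambda
that reads a file, and collect one Sample per repetition.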

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (SANSELAN-79) Include a test utility for timing and memory in project example classes

2012-05-05 Thread Gary Lucas (JIRA)

 [ 
https://issues.apache.org/jira/browse/SANSELAN-79?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated SANSELAN-79:
---

Attachment: ApacheImagingSpeedAndMemoryTest.java






[jira] [Created] (SANSELAN-80) Incorrect code for tiled TIFF files applyPredictor method

2012-05-06 Thread Gary Lucas (JIRA)
Gary Lucas created SANSELAN-80:
--

 Summary: Incorrect code for tiled TIFF files applyPredictor method 
 Key: SANSELAN-80
 URL: https://issues.apache.org/jira/browse/SANSELAN-80
 Project: Commons Sanselan
  Issue Type: Bug
  Components: Format: TIFF
Reporter: Gary Lucas
Priority: Minor


I believe that the DataReaderTiled class used for reading tiled TIFF files 
invokes the applyPredictor method with incorrect arguments and will not be able 
to properly decode TIFF files that use predictors.  The bug was found during a 
code inspection. Unfortunately, I do not have any samples of data in this 
format (there are none in the Apache Imaging test files) and cannot verify that 
this is the case.

Some Background
TIFF files are often used to store images in technical applications where data 
must be faithfully preserved, so lossy compression methods like JPEG are 
inappropriate and lossless methods like LZW must be used. However, continuous 
tone images like satellite images or photographs often do not compress well 
since there is little apparent redundancy in the data. To improve the 
redundancy of the data, TIFF uses a simple predictor.  The first pixel (gray 
tone or RGB value) in a tile is stored as a literal value.  All subsequent 
pixels are stored as differences.  To see how this works, imagine a monochrome 
picture where the gray tones gradually fade from white to black at a steady 
rate. Although no particular data value is ever repeated (so there is little 
apparent redundancy in the source data) the delta values remain constant (so a 
set of delta values will compress very well). When transformed in this manner, 
certain images show substantial improvements in compression ratio.
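
To make the fade example concrete, here is a small sketch of differencing
applied to one run of 8-bit gray samples.  The class name DeltaRow is
hypothetical and the code is an illustration of the idea, not the Apache
Imaging implementation.  Differences are wrapped to 8 bits, so a step of -5 is
stored as 251.

```java
/** Illustrative differencing of one run of 8-bit samples. */
final class DeltaRow {
    /**
     * Keeps the first sample as a literal and replaces every later
     * sample with its difference from the one before it (modulo 256).
     */
    static int[] encode(int[] samples) {
        int[] out = new int[samples.length];
        int previous = 0;   // starting at zero leaves the first sample unchanged
        for (int i = 0; i < samples.length; i++) {
            out[i] = (samples[i] - previous) & 0xFF;
            previous = samples[i];
        }
        return out;
    }
}
```

For the fading samples 250, 245, 240, 235, 230, encode() yields 250 followed
by four copies of 251.  No sample value repeats in the input, but the encoded
run is almost entirely one repeated value.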

The Problem
The DataReaderTiff class uses a method called applyPredictor that takes an 
argument telling it whether the sample passed in is the first value, and should 
be treated as a literal, or whether it is a subsequent value and should be 
treated as a delta.   Unfortunately, the parameter it uses is the x coordinate 
of the pixel to be decoded.  While this approach works for TIFF strip files 
(where the first pixel always has a coordinate of zero), it does not work for 
tiles where the first pixel in the tile could fall anywhere in the image. 
  
The Fix
While we could simply fix the argument passed into the predictor, there is a 
better solution. The predictor performs an if/then operation on the input 
parameter to find out if it is the first sample in the tile. Once it unpacks a 
sample, it retains it as the "last" value so that it may be added to the next 
delta value.  Why not simply get rid of the if/then operation and just ensure 
that the last value gets zeroed out before beginning the processing of a strip 
or tile?  This would save an if/then operation and fix the bug.
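
A minimal sketch of that idea, assuming a single 8-bit sample per pixel.  The
class ResettablePredictor and its methods are hypothetical and do not mirror
the existing applyPredictor signature.  The caller resets the state wherever a
literal value is expected, so the decoder needs no test on the x coordinate.

```java
/** Hypothetical predictor state that is zeroed instead of tested. */
final class ResettablePredictor {
    private int last;   // previously decoded sample

    /** Call before the first sample of each run that starts with a literal. */
    void reset() {
        last = 0;
    }

    /** Adds the stored delta to the previous sample (modulo 256). */
    int apply(int delta) {
        last = (last + delta) & 0xFF;
        return last;
    }

    /** Decodes one run of deltas whose first entry is a literal value. */
    int[] decodeRun(int[] deltas) {
        reset();
        int[] out = new int[deltas.length];
        for (int i = 0; i < deltas.length; i++) {
            out[i] = apply(deltas[i]);
        }
        return out;
    }
}
```

Because the previous value starts at zero, the first delta passes through
unchanged regardless of where the strip or tile sits in the image.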



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (IMAGING-69) Streamlined TIFF strip reader reduces load time by a factor of 5

2012-05-09 Thread Gary Lucas (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMAGING-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated IMAGING-69:
--

Attachment: ApacheImagingTrackerItem69_May_9_2012.patch

This patch supersedes previous files.  It is built on top of the recently added 
changes for tracker items 33 and 70 (under the new Apache Imaging numbering 
scheme).  It addresses both the strip reader and the tile reader.  I have 
also added better comments in the code to give a clearer explanation of the 
changes.

> Streamlined TIFF strip reader reduces load time by a factor of 5
> 
>
> Key: IMAGING-69
> URL: https://issues.apache.org/jira/browse/IMAGING-69
> Project: Apache Commons Imaging
>  Issue Type: Improvement
>Reporter: Gary Lucas
> Attachments: ApacheImagingTrackerItem69_May_9_2012.patch, 
> Sanselan-58-TiffStripReaderSpeed.patch, Tracker_Item_58_22_Apr_2012.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>

[jira] [Commented] (IMAGING-33) Incorrect code for tiled TIFF files applyPredictor method

2012-05-09 Thread Gary Lucas (JIRA)

[ 
https://issues.apache.org/jira/browse/IMAGING-33?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271490#comment-13271490
 ] 

Gary Lucas commented on IMAGING-33:
---

Thanks.  Nice job on the changes; they're exactly what I had in mind.  Now 
the code is both more correct than it was before and a little bit faster.

> Incorrect code for tiled TIFF files applyPredictor method 
> --
>
> Key: IMAGING-33
> URL: https://issues.apache.org/jira/browse/IMAGING-33
> Project: Apache Commons Imaging
>  Issue Type: Bug
>  Components: Format: TIFF
>Reporter: Gary Lucas
>Priority: Minor
> Fix For: 1.0
>
>
> I believe that the DataReaderTiled class used for reading tiled TIFF files 
> invokes the applyPredictor method with incorrect arguments and will not be 
> able to properly decode TIFF files that use predictors.  The bug was found 
> during a code inspection. Unfortunately, I do not have any samples of data in 
> this format (there are none in the Apache Imaging test files) and cannot 
> verify that this is the case.
> Some Background
> TIFF files are often used to store images in technical applications where 
> data must be faithfully preserved, so lossy compression methods like JPEG are 
> inappropriate and a lossless method like LZW must be used. However, continuous 
> tone images like satellite images or photographs often do not compress well 
> since there is little apparent redundancy in the data. To improve the 
> redundancy of the data, TIFF uses a simple predictor.  The first pixel (gray 
> tone or RGB value) in a tile is stored as a literal value.  All subsequent 
> pixels are stored as differences.  To see how this works, imagine a 
> monochrome picture where the gray tones gradually fade from white to black at 
> a steady rate. Although no particular data value is ever repeated (so there 
> is little apparent redundancy in the source data) the delta values remain 
> constant (so a set of delta values will compress very well). When transformed 
> in this manner, certain images show substantial improvements in compression 
> ratio.
> The Problem
> The DataReaderTiff class uses a method called applyPredictor that takes an 
> argument telling it whether the sample passed in is the first value, and 
> should be treated as a literal, or whether it is a subsequent value and 
> should be treated as a delta.   Unfortunately, the parameter it uses is the x 
> coordinate of the pixel to be decoded.  While this approach works for TIFF 
> strip files (where the first pixel always has a coordinate of zero), it does 
> not work for tiles where the first pixel in the tile could fall anywhere in 
> the image. 
>   
> The Fix
> While we could simply fix the argument passed into the predictor, there is a 
> better solution. The predictor performs an if/then operation on the input 
> parameter to find out if it is the first sample in the tile. Once it unpacks 
> a sample, it retains it as the "last" value so that it may be added to the 
> next delta value.  Why not simply get rid of the if/then operation and just 
> ensure that the last value gets zeroed out before beginning the processing of 
> a strip or tile?  This would save an if/then operation and fix the bug.





[jira] [Closed] (IMAGING-66) proposed enhancement reduces load time for some image files by 40 percent

2012-05-09 Thread Gary Lucas (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMAGING-66?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas closed IMAGING-66.
-

Resolution: Fixed

Since Damjan applied the change I proposed to all the TIFF parsing related 
routines and a number of classes related to other formats, I am closing this 
issue.  There are other places where the BufferedImage setRGB method is still 
used, but those are probably best treated as a separate tracker item.

The performance enhancements described under tracker item 69 are a separate 
issue.

> proposed enhancement reduces load time for some image files by 40 percent
> -
>
> Key: IMAGING-66
> URL: https://issues.apache.org/jira/browse/IMAGING-66
> Project: Apache Commons Imaging
>  Issue Type: Improvement
> Environment: Tested in Windows, Linux, MacOS
>Reporter: Gary Lucas
>  Labels: api-change
> Attachments: Sanselan-56-SpeedEnhanceTiff.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> I have identified an enhancement that reduces the time required to load a TIFF 
> image by 40 percent.  I have tested a modified version of Sanselan under 
> Windows, Linux, and MacOS with consistent savings on each platform.  
> Additionally, I suspect that this technique may be applicable to other areas 
> of the Sanselan code base, including more popular image formats supported by 
> Sanselan such as JPEG, PNG, etc.
> I propose to add the relevant code changes to the Sanselan code base.  Once 
> these modifications are in place, there would be an opportunity for others to 
> look at the pros and cons of applying the techniques to other data formats.
> The Enhancement
> To load an image from a TIFF file, Sanselan performs extensive data 
> processing in order to obtain RGB values for the pixels in the output image. 
> The code for that processing appears to be well written and efficient. Once 
> the RGB values are obtained, they are stored in a Java BufferedImage using a 
> call to the setRGB() method.
> Unfortunately, setRGB() is an extremely inefficient method.  A much, much 
> better approach is to store the data into an integer array and defer the 
> creation of the buffered image until all information for the image has been 
> collected.  Java has a nice (though somewhat obscure) API that lets memory 
> in an integer array be transferred directly to a BufferedImage so that the 
> system does not have to allocate additional memory for this procedure (a very 
> nice feature when dealing with huge images).  This change virtually 
> eliminated the overhead for transferring data to images, which accounted for 
> 40 percent of the time required to load images.  For TIFF files, this was a 
> reasonable approach because the TiffImageParser class always loads 4-byte 
> images and the getGrayscaleBufferedImage() method is never used.  I have not 
> investigated the code for the other renders, but some refinement might be 
> needed for the one-byte grayscale images.
> Steps to Integration
> In sanselan.common, a new class called ImagePrep was created.  ImagePrep 
> carries a width, height, and an integer array for storing pixels.  It 
> provides its own setRGB() method which looks just like the one in 
> BufferedImage.  Finally, it provides a method called getBufferedImage() 
> which creates a BufferedImage from its internal integer array when the 
> processing is complete.
> In the TiffImageParser classes, data is read from the input stream and 
> transferred to pixel values in a series of classes known as 
> PhotometricInterpreters.  These were modified to operate on ImagePrep objects 
> rather than BufferedImage objects.  The DataReader and TiffImageParser 
> classes were modified to pass ImagePrep objects into the photometric 
> interpreters rather than using BufferedImages.
> At the very last step, before passing its result back to the calling method 
> (the Sanselan main class, etc.), the TiffImageParser used 
> ImagePrep.getBufferedImage() to convert the result to the expected form.
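
The description above can be sketched roughly as follows. This is an illustration rather than the class from the attached patch: the name ImagePrepSketch and the choice of a packed ARGB layout are assumptions, while the java.awt.image calls are standard JDK API. The point is that the BufferedImage wraps the existing integer array instead of copying it.

```java
import java.awt.image.BufferedImage;
import java.awt.image.ColorModel;
import java.awt.image.DataBufferInt;
import java.awt.image.DirectColorModel;
import java.awt.image.Raster;
import java.awt.image.WritableRaster;

/** Hypothetical stand-in for the ImagePrep class described above. */
final class ImagePrepSketch {
    private final int width;
    private final int height;
    private final int[] data; // one packed ARGB value per pixel

    ImagePrepSketch(int width, int height) {
        this.width = width;
        this.height = height;
        this.data = new int[width * height];
    }

    /** Same shape as BufferedImage.setRGB(x, y, argb), but only an array store. */
    void setRGB(int x, int y, int argb) {
        data[y * width + x] = argb;
    }

    /** Wraps the pixel array in a BufferedImage without copying it. */
    BufferedImage getBufferedImage() {
        // Red, green, blue, and alpha bit masks for packed ARGB pixels.
        int[] masks = {0x00ff0000, 0x0000ff00, 0x000000ff, 0xff000000};
        DataBufferInt buffer = new DataBufferInt(data, data.length);
        WritableRaster raster =
            Raster.createPackedRaster(buffer, width, height, width, masks, null);
        ColorModel colorModel =
            new DirectColorModel(32, masks[0], masks[1], masks[2], masks[3]);
        return new BufferedImage(colorModel, raster, false, null);
    }
}
```

A caller would fill the object pixel by pixel with setRGB() and call getBufferedImage() once at the end, so the per-pixel cost is a single array store.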





[jira] [Updated] (IMAGING-69) Streamlined TIFF strip reader reduces load time by a factor of 5

2012-05-31 Thread Gary Lucas (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMAGING-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated IMAGING-69:
--

Attachment: ApacheImagingTrackerItem69_May_30_2012.patch

Damjan Jovanovic asked some questions regarding byte-order issues which led to 
this patch.  In a TIFF file, a pixel is represented by a set of "samples". For 
example, in an RGB model, the samples are the red, green, and blue values. The 
speed enhancements I proposed only worked when the samples were exactly one 
byte in size (which is usually the case, but there can be exceptions).  So I've 
added logic to check for that condition.

Once again, this patch supersedes my earlier submissions.
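
The kind of check described here might look like the sketch below. The class name, method name, and the idea of passing the per-sample bit counts as an int array are assumptions for illustration; the patch itself may do this differently.

```java
/** Hypothetical helper; the names are illustrative, not taken from the patch. */
final class SampleSizeCheck {

    /** Returns true only when every sample of a pixel occupies exactly 8 bits. */
    static boolean allSamplesAreOneByte(int[] bitsPerSample) {
        for (int bits : bitsPerSample) {
            if (bits != 8) {
                return false; // e.g. 16-bit samples, or a 5-6-5 packed layout
            }
        }
        return bitsPerSample.length > 0;
    }
}
```

The byte-at-a-time fast path would be taken only when this returns true; any other layout would fall through to the general bit-reader.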

> Streamlined TIFF strip reader reduces load time by a factor of 5
> 
>
> Key: IMAGING-69
> URL: https://issues.apache.org/jira/browse/IMAGING-69
> Project: Apache Commons Imaging
>  Issue Type: Improvement
>Reporter: Gary Lucas
> Attachments: ApacheImagingTrackerItem69_May_30_2012.patch, 
> ApacheImagingTrackerItem69_May_9_2012.patch, 
> Sanselan-58-TiffStripReaderSpeed.patch, Tracker_Item_58_22_Apr_2012.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>

[jira] [Comment Edited] (IMAGING-69) Streamlined TIFF strip reader reduces load time by a factor of 5

2012-05-31 Thread Gary Lucas (JIRA)

[ 
https://issues.apache.org/jira/browse/IMAGING-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286521#comment-13286521
 ] 

Gary Lucas edited comment on IMAGING-69 at 5/31/12 12:27 PM:
-

Damjan Jovanovic asked some questions regarding byte-order issues which led to 
this patch.  In a TIFF file, a pixel is represented by a set of "samples". For 
example, in an RGB model, the samples are the red, green, and blue values. The 
speed enhancements I proposed only worked when the samples were exactly one 
byte in size (which is usually the case, but there can be exceptions).  So I've 
added logic to check for that condition.

Once again, this patch supersedes my earlier submissions.

  was (Author: gwlucas):
Damjan Jovanovic asked some questions regarding byte-order issues which led 
to this patch.  In a TIFF file, a pixel is represented by a set of "samples". 
For example, in a RGB model, the samples are the red, green, and blue values. 
The speed enhancements I proposed only worked when the samples were exactly one 
byte in size (which is usually the case, but there can be exceptions).  So I've 
added logic to check for that condition.

Once again, this patch supercedes my earlier submissions.
  
> Streamlined TIFF strip reader reduces load time by a factor of 5
> 
>
> Key: IMAGING-69
> URL: https://issues.apache.org/jira/browse/IMAGING-69
> Project: Apache Commons Imaging
>  Issue Type: Improvement
>Reporter: Gary Lucas
> Attachments: ApacheImagingTrackerItem69_May_30_2012.patch, 
> ApacheImagingTrackerItem69_May_9_2012.patch, 
> Sanselan-58-TiffStripReaderSpeed.patch, Tracker_Item_58_22_Apr_2012.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>

[jira] [Updated] (IMAGING-69) Streamlined TIFF strip reader reduces load time by a factor of 5

2012-05-31 Thread Gary Lucas (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMAGING-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated IMAGING-69:
--

Attachment: ApacheImagingTrackerItem69_May_30_2012.patch

As per request by Damjan, file attached with license granted.

> Streamlined TIFF strip reader reduces load time by a factor of 5
> 
>
> Key: IMAGING-69
> URL: https://issues.apache.org/jira/browse/IMAGING-69
> Project: Apache Commons Imaging
>  Issue Type: Improvement
>Reporter: Gary Lucas
> Attachments: ApacheImagingTrackerItem69_May_30_2012.patch, 
> ApacheImagingTrackerItem69_May_9_2012.patch, 
> Sanselan-58-TiffStripReaderSpeed.patch, Tracker_Item_58_22_Apr_2012.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>

[jira] [Updated] (IMAGING-69) Streamlined TIFF strip reader reduces load time by a factor of 5

2012-05-31 Thread Gary Lucas (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMAGING-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated IMAGING-69:
--

Attachment: (was: ApacheImagingTrackerItem69_May_30_2012.patch)

> Streamlined TIFF strip reader reduces load time by a factor of 5
> 
>
> Key: IMAGING-69
> URL: https://issues.apache.org/jira/browse/IMAGING-69
> Project: Apache Commons Imaging
>  Issue Type: Improvement
>Reporter: Gary Lucas
> Attachments: ApacheImagingTrackerItem69_May_30_2012.patch, 
> ApacheImagingTrackerItem69_May_9_2012.patch, 
> Sanselan-58-TiffStripReaderSpeed.patch, Tracker_Item_58_22_Apr_2012.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Testing reveals that streamlining the DataReaderStrip.java operations for 8 
> and 24 bit-per-pixel TIFF images reduces the TIFF file load time by a factor 
> of 5.  
> For each pixel in images of these types, the interpretStrip() method of 
> DataReaderStrip makes calls to a generic bit extractor using its 
> getSamplesAsBytes() method.  Internally, this method simply copies the 
> requisite number of bits (8 or 24), but it executes a lot of conditional 
> statements to do so.  Under most architectures, conditionals tend to take 2 
> to 3 times as long to execute as simple arithmetic statements, so this 
> approach is expensive (especially since an image may contain millions of 
> pixels).  While the implementation is very generic, the majority of TIFF 
> files out there appear to fall into two simple categories.  By implementing 
> specialized code for these two cases, the loading time for TIFF images is 
> dramatically reduced.
> The following snippet shows the code I used for testing.  It was added right 
> at the beginning of the interpretStrip() method.
> // Oct 2011 changes.
> // The general case decoder is based on the idea of using a
> // generic bit-reader to unpack the number of bytes that are
> // needed.  Although it is efficiently implemented, it does
> // require performing at least three conditional branches per sample
> // extracted (and often more).  This change attempts to bypass that
> // overhead by implementing specialized blocks of extraction code
> // for commonly used 8 bitsPerPixel and 24 bitsPerPixel cases.
> // In other cases, it will simply fall through to the original code.
> // Note that when promoting a byte to an integer, it is necessary
> // to mask it with 0xff because the Java byte type is signed
> // and this implementation requires an unsigned value.
> if (x >= width)
> {
>     // This may not be required.  It was coded based on the
>     // original implementation.  But looking at the TIFF 6.0 spec,
>     // it looks like the rows always evenly fill out the strip,
>     // so there should never be a partial row in a strip and x
>     // should not be anything except zero.
>     x = 0;
>     y++;
> }
> if (y >= height)
> {
>     // We check it once before starting, so that we don't have
>     // to check it redundantly for each pixel.
>     return;
> }
> 
> if (predictor == -1 && this.bitsPerPixel == 8)
> {
>     int[] samples = new int[1];
>     // NOTE: the loop bound was lost in the archived message;
>     // bytes.length is an assumed reconstruction.
>     for (int i = 0; i < bytes.length; i++)
>     {
>         samples[0] = bytes[i] & 0x00ff;
>         photometricInterpreter.interpretPixel(bi, samples, x, y);
>         x++;
>         if (x >= width)
>         {
>             x = 0;
>             y++;
>             if (y >= height)
>                 return; // any remaining bytes are not needed
>         }
>     }
>     return;
> }
> else if (predictor == -1 && this.bitsPerPixel == 24)
> {
>     int[] samples = new int[3];
>     int k = 0;
>     // NOTE: the loop bound was lost in the archived message;
>     // bytes.length / 3 (one pass per 3-byte pixel) is an assumed
>     // reconstruction.
>     for (int i = 0; i < bytes.length / 3; i++)
>     {
>         samples[0] = bytes[k++] & 0x00ff;
>         samples[1] = bytes[k++] & 0x00ff;
>         samples[2] = bytes[k++] & 0x00ff;
>         photometricInterpreter.interpretPixel(bi, samples, x, y);
>         x++;
>         if (x >= width)
>         {
>             x = 0;
>             y++;
>             if (y >= height)
>                 return; // any remaining bytes are not needed
>         }
>     }
>     return;
> }
> 
> // original code before Oct 2011 modification
> ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
> BitInputStream bis = new BitInputStream(bais);
> etc.
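
To show the byte-masking point from the comments in isolation, here is a self-contained sketch of a 24-bit fast path. It is a simplification under stated assumptions: the class Rgb24FastPath is hypothetical, and it packs pixels straight into ARGB integers instead of calling a photometric interpreter as the quoted snippet does.

```java
/** Hypothetical, simplified stand-in for the 24-bit fast path. */
final class Rgb24FastPath {

    /** Packs tightly stored R,G,B bytes into opaque ARGB ints, one per pixel. */
    static int[] unpack(byte[] bytes, int width, int height) {
        // Never read past the data, and ignore bytes beyond the image area.
        int nPixels = Math.min(width * height, bytes.length / 3);
        int[] argb = new int[nPixels];
        int k = 0;
        for (int i = 0; i < nPixels; i++) {
            // Mask each byte: Java bytes are signed, so (byte) 0xff would
            // otherwise be promoted to -1 rather than 255.
            int r = bytes[k++] & 0xff;
            int g = bytes[k++] & 0xff;
            int b = bytes[k++] & 0xff;
            argb[i] = 0xff000000 | (r << 16) | (g << 8) | b;
        }
        return argb;
    }
}
```

The loop body contains only array reads, masks, and shifts, which is the property the specialized code relies on for its speed.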


[jira] [Created] (IMAGING-81) Add more Javadoc to main package

2012-06-25 Thread Gary Lucas (JIRA)
Gary Lucas created IMAGING-81:
-

 Summary: Add more Javadoc to main package
 Key: IMAGING-81
 URL: https://issues.apache.org/jira/browse/IMAGING-81
 Project: Apache Commons Imaging
  Issue Type: Improvement
  Components: Documentation
Reporter: Gary Lucas
Assignee: Luc Maisonobe
Priority: Minor


The current version of Apache Commons Imaging has minimal Javadoc.  While 
the task of completely supplying Javadoc for the package would easily require 
multiple man-months of effort, it would be useful to add information to at 
least the top-level package and main classes. 





[jira] [Updated] (IMAGING-81) Add more Javadoc to main package

2012-06-25 Thread Gary Lucas (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMAGING-81?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated IMAGING-81:
--

Attachment: Lucas_Javadoc_25_June_2012.patch

The following adds Javadoc to Imaging and ImageParser, with at least 
class-level descriptions for all the main classes.  I've also scrubbed up some 
of the existing Javadoc in Imaging.java (which still referred to Sanselan) and 
added package.html documentation to some of the packages.  I hope that this 
documentation will make it easier for developers to work with the Imaging 
library.

> Add more Javadoc to main package
> 
>
> Key: IMAGING-81
> URL: https://issues.apache.org/jira/browse/IMAGING-81
> Project: Apache Commons Imaging
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Gary Lucas
>Assignee: Luc Maisonobe
>Priority: Minor
> Attachments: Lucas_Javadoc_25_June_2012.patch
>
>
> The current version of Apache Commons Imaging has minimal Javadoc.  While 
> the task of completely supplying Javadoc for the package would easily require 
> multiple man-months of effort, it would be useful to add information to at 
> least the top-level package and main classes. 





[jira] [Created] (IMAGING-263) Failure when reading a partial raster from a floating-point TIFF

2020-08-12 Thread Gary Lucas (Jira)
Gary Lucas created IMAGING-263:
--

 Summary: Failure when reading a partial raster from a 
floating-point TIFF
 Key: IMAGING-263
 URL: https://issues.apache.org/jira/browse/IMAGING-263
 Project: Commons Imaging
  Issue Type: Bug
  Components: Format: TIFF
Affects Versions: 1.0-alpha2
Reporter: Gary Lucas
 Fix For: 1.0-alpha3


When reading a partial raster from a floating-point TIFF file that uses a tiled 
format, the read will fail if the starting position is beyond the first row of 
tiles.  The result will be a raster subset filled with zero values.
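
The report does not show the failing code, so the following is only a sketch of the index arithmetic that a tiled partial read has to get right. It is not the Commons Imaging implementation, and the class and method names are hypothetical. It computes which tile columns and rows overlap a requested sub-rectangle:

```java
final class TileRangeSketch {

    /**
     * Returns {firstCol, lastCol, firstRow, lastRow}: the inclusive range of
     * tile indices that overlap the requested pixel rectangle.
     */
    static int[] tilesCovering(int x, int y, int width, int height,
                               int tileWidth, int tileHeight) {
        if (width <= 0 || height <= 0 || tileWidth <= 0 || tileHeight <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        int firstCol = x / tileWidth;
        int lastCol = (x + width - 1) / tileWidth;
        int firstRow = y / tileHeight;
        int lastRow = (y + height - 1) / tileHeight;
        return new int[] {firstCol, lastCol, firstRow, lastRow};
    }
}
```

With 256-pixel tiles, a request that starts at row 600 begins in tile row 2, so a reader that assumed the request always starts in tile row 0 would not pick up the right tiles.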



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IMAGING-265) ArrayIndexOutOfBoundsException on reading simple GeoTIFF

2020-09-16 Thread Gary Lucas (Jira)


 [ 
https://issues.apache.org/jira/browse/IMAGING-265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated IMAGING-265:
---
Attachment: small_world_split.jpg

> ArrayIndexOutOfBoundsException on reading simple GeoTIFF
> 
>
> Key: IMAGING-265
> URL: https://issues.apache.org/jira/browse/IMAGING-265
> Project: Commons Imaging
>  Issue Type: Bug
>  Components: Format: TIFF
>Affects Versions: 1.0-alpha2
>Reporter: edgar soldin
>Priority: Major
> Attachments: small_world.tif, small_world_split.jpg
>
>
> hi,
>  
> we on the OpenJUMP project cannot open some GeoTIFFs with commons.imaging . 
> for details you may find a ticket in our bug tracker 
> [https://sourceforge.net/p/jump-pilot/bugs/498/] .
>  
> the gist is: on loading the attached file getBufferedImage() fails with this 
> stack
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 8000
>  at org.apache.commons.imaging.formats.tiff.datareaders.DataReaderStrips.interpretStrip(DataReaderStrips.java:196)
>  at org.apache.commons.imaging.formats.tiff.datareaders.DataReaderStrips.readImageData(DataReaderStrips.java:254)
>  at org.apache.commons.imaging.formats.tiff.TiffImageParser.getBufferedImage(TiffImageParser.java:665)
>  at org.apache.commons.imaging.formats.tiff.TiffDirectory.getTiffImage(TiffDirectory.java:254)
>  at org.apache.commons.imaging.formats.tiff.TiffImageParser.getBufferedImage(TiffImageParser.java:469)
>  at org.apache.commons.imaging.Imaging.getBufferedImage(Imaging.java:1442)
>  at org.apache.commons.imaging.Imaging.getBufferedImage(Imaging.java:1335)
>  at org.apache.commons.imaging.Imaging.getBufferedImage(Imaging.java:1304)
>  at com.vividsolutions.jump.workbench.imagery.graphic.CommonsImage.initImage(CommonsImage.java:108)





[jira] [Commented] (IMAGING-265) ArrayIndexOutOfBoundsException on reading simple GeoTIFF

2020-09-16 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17197298#comment-17197298
 ] 

Gary Lucas commented on IMAGING-265:


It took a while, but I've identified the problem and I think that the original 
authors of TIFF were in love with complexity.

It turns out that the TIFF standard specifies how a file may store its RGB 
components.  The arrangement is given by a TIFF tag known as Planar 
Configuration, and there are two variations:

You may specify them in a sequence of bytes such as 
   R1, G1, B1, R2, G2, B2, R3, G3, B3,  ...  , Rn, Gn, Bn   (Planar 
Configuration 1)

But you can also split the data into separate "planes" such as
   R1, R2, R3, ... Rn, G1, G2, G3, ... Gn, B1, B2, B3, ... Bn  (Planar 
Configuration 2)
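
As a rough illustration of the two layouts, the sketch below merges three separate planes into the interleaved order of Planar Configuration 1. It is not code from Commons Imaging; the class and method names are made up for this example, and it assumes 8 bits per sample.

```java
final class PlanarLayoutSketch {

    /**
     * Merges separate R, G and B planes (Planar Configuration 2) into one
     * interleaved array R1, G1, B1, R2, G2, B2, ... (Planar Configuration 1).
     * Assumes one byte per sample and planes of equal length.
     */
    static byte[] interleave(byte[] r, byte[] g, byte[] b) {
        if (r.length != g.length || g.length != b.length) {
            throw new IllegalArgumentException("planes must have equal length");
        }
        byte[] rgb = new byte[3 * r.length];
        for (int i = 0; i < r.length; i++) {
            rgb[3 * i] = r[i];
            rgb[3 * i + 1] = g[i];
            rgb[3 * i + 2] = b[i];
        }
        return rgb;
    }
}
```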

The problematic image uses Planar Configuration 2.  Searching through the code, 
I don't find any evidence of Planar Configuration 2 implemented anywhere.  So 
it looks like this configuration is not currently supported by Commons Imaging. 
  

The array bounds failure occurs because the small_world.tif file is organized 
into blocks of 8000 bytes representing strips of 20 rows 400 pixels wide.  In 
Planar Configuration 1, the code would expect to see 3 bytes per pixel, or 
24000 bytes per strip (block). In Planar Configuration 2, the data is split 
between 3 blocks of 8000 which have to be combined to get a full set of RGB 
values.  Since Commons Imaging does not recognize Planar Configuration 2 (at 
least, not yet), it thinks it has 24000 bytes to process and wackiness ensues.
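
The arithmetic in the paragraph above can be checked with a few lines of Java. This is only a worked example under the assumption of one byte per sample; the class and method names are hypothetical and do not come from the library.

```java
final class StripSizeSketch {

    /** Bytes in one strip when all samples of a pixel are stored together (Planar Configuration 1). */
    static int chunkyStripBytes(int width, int rowsPerStrip, int samplesPerPixel) {
        return width * rowsPerStrip * samplesPerPixel;
    }

    /** Bytes in one strip when each strip holds a single color plane (Planar Configuration 2). */
    static int planarStripBytes(int width, int rowsPerStrip) {
        return width * rowsPerStrip;
    }

    public static void main(String[] args) {
        int width = 400;
        int rowsPerStrip = 20;
        System.out.println(chunkyStripBytes(width, rowsPerStrip, 3)); // 24000
        System.out.println(planarStripBytes(width, rowsPerStrip));    // 8000
    }
}
```

A reader that expects 24000 bytes but is handed an 8000-byte block runs off the end at index 8000, which is consistent with the index reported in the exception.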

THE USUAL PLEA FOR TEST DATA
I have a fix in mind that will address this particular file. But it will only 
work for files using the RGB color model and organized into strips with no data 
compression.  I could easily extend it for files organized with tiles, but I 
have no test data.   And that still leaves the issue of different photometric 
interpreters and data compression. I'm not sure that these options even matter, 
because I suspect that Planar Configuration 2 is unusual in modern data systems.

For the OpenJUMP folks:  As a temporary work-around, you can try to use 
interleaved RGB rather than the separate planes.  I know that's not much of a 
help if you're not in control of the format for the images you receive, but at 
least you will know why some images don't work.

And finally, just as a point of interest, I've attached a JPEG showing the 
3-plane content of the small world file.  I hacked Commons Imaging so that it 
wouldn't crash when I read the file.  But, clearly, I have more work to do to 
render it properly.
 !small_world_split.jpg! 



[jira] [Comment Edited] (IMAGING-265) ArrayIndexOutOfBoundsException on reading simple GeoTIFF

2020-09-18 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17197298#comment-17197298
 ] 

Gary Lucas edited comment on IMAGING-265 at 9/18/20, 12:29 PM:
---

It took a while, but I've identified the problem and I think that the original 
authors of TIFF were in love with complexity.

It turns out that the TIFF standard specifies how a file may store its RGB 
components. The arrangement is given by a TIFF tag known as Planar 
Configuration, and there are two variations:

You may specify them in a sequence of bytes such as 
 R1, G1, B1, R2, G2, B2, R3, G3, B3, ... , Rn, Gn, Bn (Planar Configuration 1)

But you can also split the data into separate "planes" such as
 R1, R2, R3, ... Rn, G1, G2, G3, ... Gn, B1, B2, B3, ... Bn (Planar Configuration 
2)

The problematic image uses Planar Configuration 2. Searching through the code, 
I don't find any evidence of Planar Configuration 2 implemented anywhere. So it 
looks like this configuration is not currently supported by Commons Imaging.

The array bounds failure occurs because the small_world.tif file is organized 
into blocks of 8000 bytes representing strips of 20 rows 400 pixels wide. In 
Planar Configuration 1, the code would expect to see 3 bytes per pixel, or 
24000 bytes per strip (block). In Planar Configuration 2, the data is split 
between 3 blocks of 8000 which have to be combined to get a full set of RGB 
values. Since Commons Imaging does not recognize Planar Configuration 2 (at 
least, not yet), it thinks it has 24000 bytes to process and wackiness ensues.

THE USUAL PLEA FOR TEST DATA
 I have a fix in mind that will address this particular file. But it will only 
work for files using the RGB color model and organized into strips with no data 
compression. I could easily extend it for files organized with tiles, but I 
have no test data. And that still leaves the issue of different photometric 
interpreters and data compression. I'm not sure that these options even matter, 
because I suspect that Planar Configuration 2 is unusual in modern data systems.

For the OpenJUMP folks: As a temporary work-around, you can try to use 
interleaved RGB rather than the separate planes. I know that's not much of a 
help if you're not in control of the format for the images you receive, but at 
least you will know why some images don't work.

And finally, just as a point of interest, I've attached a JPEG showing the 
3-plane content of the small world file. I hacked Commons Imaging so that it 
wouldn't crash when I read the file. But, clearly, I have more work to do to 
render it properly.
 !small_world_split.jpg!



[jira] [Created] (IMAGING-266) Read integer data from GeoTIFFS

2020-09-18 Thread Gary Lucas (Jira)
Gary Lucas created IMAGING-266:
--

 Summary: Read integer data  from GeoTIFFS
 Key: IMAGING-266
 URL: https://issues.apache.org/jira/browse/IMAGING-266
 Project: Commons Imaging
  Issue Type: New Feature
  Components: Format: TIFF
Affects Versions: 1.0-alpha3
Reporter: Gary Lucas


I recently discovered that there is a large amount of digital elevation data 
available in the form of 16-bit integer coded data in GeoTIFF files (TIFF files 
with geographic tags).  I propose to enhance the Commons Imaging API to read 
these files.  This work will be similar to the work I did for reading 
floating-point raster data under IMAGING-251.

Available data include the nearly global coverage of one-arc-second 
elevation data produced from the Shuttle Radar Topography Mission (SRTM) and 
other sources. These products give grids of elevation data with a 30-meter cell 
spacing for most of the world's land masses. They are available at the NASA 
Earthdata and Japan Space Systems websites; see 
[https://asterweb.jpl.nasa.gov/gdem.asp]. There is also an ocean bathymetry data 
set available in this format at [http://www.shadedrelief.com/blue-earth/].

This new feature will continue to expand the usefulness of the Commons Imaging 
API in accessing GeoTIFF products.

Request for Feedback

So far, the data products I've found (ASTER and Blue Earth Bathymetry) give 
elevation and ocean depth data in meters recorded as a short integer.  I 
haven't found an example of where the 32-bit integer format is used.  For now, 
I am planning on only coding the 16-bit integer variation.  Does anyone know if 
the 32-bit version is worth supporting?  My criterion would be whether a 
significant number of projects use that format 
(life is too short to chase rarely used data formats).

Currently, one of the code-analysis operations conducted by the Commons Imaging 
build process is coverage by JUnit tests.  Lacking any test data for the 32-bit 
case, I am reluctant to include it in the code base because it would mean 
putting uncovered code into the distribution.

Also, I am wondering about the best design for the access API.  The current 
TiffImageParser class has a method called getFloatingPointRasterData() that 
returns an instance of TiffRasterData.  TiffRasterData is pretty much 
hard-wired to floating point data.  I am thinking of creating a new method 
called getIntegerRasterData() that would return an instance of a new class 
called TiffIntegerRasterData. Does this seem reasonable?  I considered trying 
to combine both kinds of results into a unified class and method, but it seems 
more unwieldy than useful. 
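
To make the design question concrete, here is a minimal sketch of what an integer raster holder could look like. It is not the proposed TiffIntegerRasterData class; the name IntegerRasterSketch, its fields, and the fromInt16 helper are all hypothetical, and the sketch only covers uncompressed, signed 16-bit samples in a known byte order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class IntegerRasterSketch {
    final int width;
    final int height;
    final int[] data; // row-major, one value per cell

    IntegerRasterSketch(int width, int height, int[] data) {
        if (width <= 0 || height <= 0 || data.length != width * height) {
            throw new IllegalArgumentException("data does not match dimensions");
        }
        this.width = width;
        this.height = height;
        this.data = data;
    }

    int getValue(int x, int y) {
        return data[y * width + x];
    }

    /** Decodes signed 16-bit samples stored in the given byte order. */
    static IntegerRasterSketch fromInt16(byte[] bytes, int width, int height, ByteOrder order) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes).order(order);
        int[] data = new int[width * height];
        for (int i = 0; i < data.length; i++) {
            data[i] = buffer.getShort(); // sign-extended to int
        }
        return new IntegerRasterSketch(width, height, data);
    }
}
```

Storing the values as int rather than short is one possible choice: it would leave room for a 32-bit variation later without changing the accessor, at the cost of extra memory for 16-bit data.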

 

 





[jira] [Updated] (IMAGING-267) Colorful rendering of b/w Monoband TIF

2020-09-21 Thread Gary Lucas (Jira)


 [ 
https://issues.apache.org/jira/browse/IMAGING-267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated IMAGING-267:
---
Attachment: ISSUE_267.JPG

> Colorful rendering of b/w Monoband TIF
> --
>
> Key: IMAGING-267
> URL: https://issues.apache.org/jira/browse/IMAGING-267
> Project: Commons Imaging
>  Issue Type: Bug
>Reporter: edgar soldin
>Priority: Major
> Attachments: ISSUE_267.JPG, mdt25a-commons.png, mdt25a-sextante.png, 
> mdt25a.tif
>
>
> see attached images.
> mdt25a.tif - the original tif
> mdt25a-commons.png - as rendered/read with Commons Imaging
> mdt25a-sextante.png - as rendered /read properly with ImageIO-Core from 
> https://github.com/jai-imageio/jai-imageio-core
> thanks!.. ede





[jira] [Commented] (IMAGING-267) Colorful rendering of b/w Monoband TIF

2020-09-21 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199709#comment-17199709
 ] 

Gary Lucas commented on IMAGING-267:


Thank you for posting the images. They may help to answer some issues I've 
wondered about for a while.  I've inspected their content and I have a few 
questions for you.

First off, this TIFF appears to be geophysical data.  But there are no GeoTIFF 
tags bundled with the image. Judging from the ModelTiepointTag, it appears that 
the original image was configured using a projected coordinate system, maybe a 
UTM zone.  It might help me understand this better if there were more location 
information attached to the image. Do you have that information?
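
For reference, the ModelTiepointTag and ModelPixelScaleTag values listed with the other tags later in this comment are enough to turn a raster position into a coordinate in whatever projected system the file uses. The sketch below follows the usual GeoTIFF tiepoint-and-scale convention with no rotation; the class and method names are hypothetical, and it says nothing about which coordinate system the numbers belong to.

```java
final class TiepointSketch {

    /**
     * Converts a raster position to model coordinates {x, y}.
     * tiepoint is (I, J, K, X, Y, Z); pixelScale is (scaleX, scaleY).
     * Model Y decreases as the row index increases.
     */
    static double[] rasterToModel(double col, double row,
                                  double[] tiepoint, double[] pixelScale) {
        double x = tiepoint[3] + (col - tiepoint[0]) * pixelScale[0];
        double y = tiepoint[4] - (row - tiepoint[1]) * pixelScale[1];
        return new double[] {x, y};
    }
}
```

With the values from this file, raster position (0, 0) maps to (262846.525725, 4464275.0), and each step of one pixel moves 25 units east or south.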

Secondly, I note that this TIFF file is not, strictly speaking, an image, but is 
actually numerical data stored using the TIFF standard floating-point raster 
format.  My guess is that it's probably elevations. The ability to process TIFF 
files containing floating point data was introduced in the most recent release 
of Commons Imaging.

One challenge of floating point images is how to map the range of values to 
gray scale. In this case, the following text gives the TIFF Tags attached to 
the image:

Directory 0 Has TIFF Image Data, description: Root
 256 (0x100: ImageWidth): 601 (1 Long)
 257 (0x101: ImageLength): 410 (1 Long)
 258 (0x102: BitsPerSample): 32 (1 Short)
 259 (0x103: Compression): 1 (1 Short)
 262 (0x106: PhotometricInterpretation): 1 (Indicates "zero is black")
 277 (0x115: SamplesPerPixel): 1 (1 Short)
 278 (0x116: RowsPerStrip): 8 (1 Long)
 339 (0x153: SampleFormat): 3 (Indicates floating-point format)
 33550 (0x830e: ModelPixelScaleTag): 25.0, 25.0 (2 Double)
 33922 (0x8482: ModelTiepointTag): 0.0, 0.0, 0.0, 262846.525725, 4464275.0, 0.0 
(6 Double)
 42113 (0xa481: GDALNoData): 45, 51, 50, 55, 54, 56, 46, 48 (8 Byte)
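
The GDALNoData entry in the listing above is stored as ASCII text rather than as a number. A small sketch, using a hypothetical helper name, shows how the eight bytes decode:

```java
import java.nio.charset.StandardCharsets;

final class GdalNoDataSketch {

    /** Interprets the tag bytes as ASCII text and parses them as a number. */
    static double parseNoData(byte[] tagBytes) {
        // trim() also drops a trailing NUL terminator if one is present
        String text = new String(tagBytes, StandardCharsets.US_ASCII).trim();
        return Double.parseDouble(text);
    }

    public static void main(String[] args) {
        byte[] tag = {45, 51, 50, 55, 54, 56, 46, 48};
        System.out.println(parseNoData(tag)); // -32768.0
    }
}
```

So cells holding -32768 mark missing data and would normally be left out when working out the range of real values.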

Upon inspection, I find that the values in the image range from 514 to 2410.  
Commons Imaging does have an API element called the 
custom-photometric-interpreter that lets an application specify how colors (or 
gray tones) are assigned to elevations.  So in this case, I was able to render 
the data by specifying the following lines:

{{ // 514 and 2410 are the minimum and maximum values found in this file }}
{{ HashMap<String, Object> params = new HashMap<>(); }}
{{ PhotometricInterpreterFloat pi  = }}
{{      new PhotometricInterpreterFloat(514.0f, 2410.0f);}}
{{ params.put(TiffConstants.PARAM_KEY_CUSTOM_PHOTOMETRIC_INTERPRETER, pi);}}
{{ BufferedImage bImage = Imaging.getBufferedImage(target, params);}}
{{ ImageIO.write(bImage, "JPG", output);}}

 

I've attached the image I produced (ISSUE_267.JPG).  However, to create it, I 
had to know beforehand what the range of values was.  So my application does a 
few extra steps that I did not show in the example above.  I was wondering how 
the software you used handles this issue.  Is it all automatic?  
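
The extra steps are not shown in the comment. One plausible way to find the range, sketched here under the assumption that the samples are already available as a float array and that the no-data value is known, is a single pass that skips NaN and no-data cells. The names are hypothetical.

```java
final class ValueRangeSketch {

    /**
     * Returns {min, max} over all samples, skipping NaN and the no-data value.
     * If no valid sample exists, min stays +Infinity and max stays -Infinity.
     */
    static float[] findRange(float[] samples, float noData) {
        float min = Float.POSITIVE_INFINITY;
        float max = Float.NEGATIVE_INFINITY;
        for (float v : samples) {
            if (Float.isNaN(v) || v == noData) {
                continue;
            }
            if (v < min) {
                min = v;
            }
            if (v > max) {
                max = v;
            }
        }
        return new float[] {min, max};
    }
}
```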

Also, a second question I had is that the PhotometricInterpretation tag given 
with your image is 1, which means "0 is black".   In other words, the palette 
should range from the darkest shading for the lowest numerical values to the 
lightest shading for the highest values. However, in looking at your image I 
notice that the lowest value pixels are drawn in the lightest colors, which 
seems to contradict the setting in the source TIFF file.  In the image I've 
attached, the lowest value pixels are drawn in the darkest colors, which is 
consistent with the specification in the TIFF image.  Is there some setting in 
the application you used that overwrites the settings from the TIFF file?
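
The difference between the two renderings comes down to the direction of a linear mapping. The sketch below is a generic illustration and not how PhotometricInterpreterFloat is implemented; the class name and the blackIsZero flag are made up for the example. In TIFF terms, PhotometricInterpretation 1 corresponds to blackIsZero being true.

```java
final class GrayMappingSketch {

    /**
     * Maps a value in [min, max] to a gray level 0..255. With blackIsZero
     * true, the lowest value is darkest; with false, the scale is inverted.
     * Values outside the range are clamped.
     */
    static int toGray(float value, float min, float max, boolean blackIsZero) {
        if (!(max > min)) {
            throw new IllegalArgumentException("max must be greater than min");
        }
        float t = (value - min) / (max - min);
        t = Math.max(0f, Math.min(1f, t));
        int gray = Math.round(t * 255f);
        return blackIsZero ? gray : 255 - gray;
    }
}
```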

!ISSUE_267.JPG!

 



[jira] [Commented] (IMAGING-265) ArrayIndexOutOfBoundsException on reading simple GeoTIFF

2020-09-21 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199714#comment-17199714
 ] 

Gary Lucas commented on IMAGING-265:


Thanks for the information.  The original authors of the Commons Imaging 
project put an enormous amount of effort into creating the TIFF modules and I 
think they left us with a pretty good foundation.  Nowadays, it's unusual to 
find an example of a format that Commons Imaging doesn't support...  unusual, 
but perhaps not quite unusual enough :)

In the case of the samples you provided, they were created using an "odd duck" 
specification that we hadn't seen before.  I knew there was something called a 
PlanarConfiguration, but I had no idea what it was about.  So your image 
provided a good opportunity to expand our capabilities.  Of course, nobody 
likes to learn about a new bug (or a new feature) that needs to be addressed... 
But it definitely leads to more robust code when we do. 

> ArrayIndexOutOfBoundsException on reading simple GeoTIFF
> 
>
> Key: IMAGING-265
> URL: https://issues.apache.org/jira/browse/IMAGING-265
> Project: Commons Imaging
>  Issue Type: Bug
>  Components: Format: TIFF
>Affects Versions: 1.0-alpha2
>Reporter: edgar soldin
>Assignee: Bruno P. Kinoshita
>Priority: Major
> Attachments: small_world.tif, small_world_split.jpg
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> hi,
>  
> we on the OpenJUMP project cannot open some GeoTIFFs with commons.imaging . 
> for details you may find a ticket in our bug tracker 
> [https://sourceforge.net/p/jump-pilot/bugs/498/] .
>  
> the gist is: on loading the attached file getBufferedImage() fails with this 
> stack
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 8000
>  at org.apache.commons.imaging.formats.tiff.datareaders.DataReaderStrips.interpretStrip(DataReaderStrips.java:196)
>  at org.apache.commons.imaging.formats.tiff.datareaders.DataReaderStrips.readImageData(DataReaderStrips.java:254)
>  at org.apache.commons.imaging.formats.tiff.TiffImageParser.getBufferedImage(TiffImageParser.java:665)
>  at org.apache.commons.imaging.formats.tiff.TiffDirectory.getTiffImage(TiffDirectory.java:254)
>  at org.apache.commons.imaging.formats.tiff.TiffImageParser.getBufferedImage(TiffImageParser.java:469)
>  at org.apache.commons.imaging.Imaging.getBufferedImage(Imaging.java:1442)
>  at org.apache.commons.imaging.Imaging.getBufferedImage(Imaging.java:1335)
>  at org.apache.commons.imaging.Imaging.getBufferedImage(Imaging.java:1304)
>  at com.vividsolutions.jump.workbench.imagery.graphic.CommonsImage.initImage(CommonsImage.java:108)





[jira] [Comment Edited] (IMAGING-265) ArrayIndexOutOfBoundsException on reading simple GeoTIFF

2020-09-21 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199714#comment-17199714
 ] 

Gary Lucas edited comment on IMAGING-265 at 9/22/20, 12:20 AM:
---

Thanks for the information.  The original authors of the Commons Imaging 
project put an enormous amount of effort into creating the TIFF modules and I 
think they left the project with a pretty good foundation.  Nowadays, it's 
unusual to find an example of a format that Commons Imaging doesn't support...  
unusual, but perhaps not quite unusual enough :)

In the case of the samples you provided, they were created using an "odd duck" 
specification that we hadn't seen before.  I knew there was something called a 
PlanarConfiguration, but I had no idea what it was about.  So your image 
provided a good opportunity to expand our capabilities.  Of course, nobody 
likes to learn about a new bug (or a new feature) that needs to be addressed... 
But it definitely leads to more robust code when we do. 

 

Gary

P.S.  Just to make sure there's no confusion here, I am not, properly speaking, a 
member of the Commons Imaging project and certainly do not speak for them. I 
help out with coding occasionally.  My post above makes it sound like I know a 
lot more than I actually do...  Hope I didn't give a false sense of authority.

 




[jira] [Updated] (IMAGING-266) Read integer data from GeoTIFFS

2020-09-22 Thread Gary Lucas (Jira)


 [ 
https://issues.apache.org/jira/browse/IMAGING-266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated IMAGING-266:
---
Description: 
I recently discovered that there is a large amount of digital elevation data 
available in the form of 16-bit integer coded data in GeoTIFF files (TIFF files 
with geographic tags).  I propose to enhance the Commons Imaging API to read 
these files.  This work will be similar to the work I did for reading 
floating-point raster data under IMAGING-251.

Available data include the nearly global coverage of one-arc-second 
elevation data produced from the Shuttle Radar Topography Mission (SRTM) and 
other sources. These products give grids of elevation data with a 30-meter cell 
spacing for most of the world's land masses. They are available at the NASA 
Earthdata and Japan Space Systems websites; see 
[https://asterweb.jpl.nasa.gov/gdem.asp].
There is also an ocean bathymetry data set available in this format at 
[http://www.shadedrelief.com/blue-earth/]

This new feature will continue to expand the usefulness of the Commons Imaging 
API in accessing GeoTIFF products.

Request for Feedback

So far, the data products I've found (ASTER and Blue Earth Bathymetry) give 
elevation and ocean depth data in meters recorded as a short integer.  I 
haven't found an example of where the 32-bit integer format is used.  For now, 
I am planning on only coding the 16-bit integer variation.  Does anyone know if 
the 32-bit version is worth supporting?  My criterion would be whether a 
significant number of projects use that format 
(life is too short to chase rarely used data formats).

Currently, one of the code-analysis operations conducted by the Commons Imaging 
build process is coverage by JUnit tests.  Lacking any test data for the 32-bit 
case, I am reluctant to include it in the code base because it would mean 
putting uncovered code into the distribution.

Also, I am wondering about the best design for the access API.  The current 
TiffImageParser class has a method called getFloatingPointRasterData() that 
returns an instance of TiffRasterData.  TiffRasterData is pretty much 
hard-wired to floating point data.  I am thinking of creating a new method 
called getIntegerRasterData() that would return an instance of a new class 
called TiffIntegerRasterData. Does this seem reasonable?  I considered trying 
to combine both kinds of results into a unified class and method, but it seems 
more unwieldy than useful. 

 

 


[jira] [Updated] (IMAGING-266) Read integer data from GeoTIFFS

2020-09-22 Thread Gary Lucas (Jira)


 [ 
https://issues.apache.org/jira/browse/IMAGING-266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated IMAGING-266:
---
Description: 
I recently discovered that there is a large amount of digital elevation data 
available in the form of 16-bit integer coded data in GeoTIFF files (TIFF files 
with geographic tags).  I propose to enhance the Commons Imaging API to read 
these files.  This work will be similar to the work I did for reading 
floating-point raster data under ISSUE-251.

Available data include the nearly-global coverage of one-second of arc 
elevation data produced from the Shuttle Radar Topography Mission (SRTM) and 
other sources. These products give grids of elevation data with a 30 meter cell 
spacing for most of the world's land masses. They are available at NASA 
Earthdata and Japan Space Systems websites, see 
[https://asterweb.jpl.nasa.gov/gdem.asp|https://asterweb.jpl.nasa.gov/gdem.asp.]
 There is also an ocean bathymetry data set available in this format at 
[http://www.shadedrelief.com/blue-earth/]

This new feature will continue to expand the usefulness of the Commons Imaging 
API in accessing GeoTIFF products.

Request for Feedback

So far, the data products I've found (ASTER and Blue Earth Bathymetry) give 
elevation and ocean depth data in meters recorded as a short integer.  I 
haven't found an example of where the 32-bit integer format is used.  For now, 
I am planning on only coding the 16-bit integer variation.  Does anyone know if 
the 32-bit version is worth supporting?  My criteria for determining that would 
be based on whether there is a significant number of projects using that format 
(life is too short to chase rarely used data formats).

Currently, one of the code-analysis operations conducted by the Commons Imaging 
build process is coverage by JUnit tests.  Lacking any test data for the 32-bit 
case, I am reluctant to include it in the code base because it would mean 
putting uncovered code into the distribution.

Also, I am wondering about the best design for the access API.  The current 
TiffImageParser class has a method called getFloatingPointRasterData() that 
returns an instance of TiffRasterData.  TiffRasterData is pretty much 
hard-wired to floating point data.  I am thinking of creating a new method 
called getIntegerRasterData() that would return an instance of a new class 
called TiffIntegerRasterData. Does this seem reasonable?  I considered trying 
to combine both kinds of results into a unified class and method, but it seems 
more unwieldy than useful. 

 

 

  was:
I recently discovered that there is a large amount of digital elevation data 
available in the form of 16-bit integer coded data in GeoTIFF files (TIFF files 
with geographic tags).  I propose to enhance the Commons Imaging API to read 
these files.  This work will be similar to the work I did for reading 
floating-point raster data under ISSUE-251.

Available data include the nearly-global coverage of one-second of arc 
elevation data produced from the Shuttle Radar Topography Mission (SRTM) and 
other sources. These products give grids of elevation data with a 30 meter cell 
spacing for most of the world's land masses. They are available at NASA 
Earthdata and Japan Space Systems websites, see 
[https://asterweb.jpl.nasa.gov/gdem.asp|https://asterweb.jpl.nasa.gov/gdem.asp.]
 . There is also a ocean bathymetry data set available in this format at 
[http://www.shadedrelief.com/blue-earth/]

This new feature will continue to expand the usefulness of the Commons Imaging 
API in accessing GeoTIFF products.

Request for Feedback

So far, the data products I've found (ASTER and Blue Earth Bathymetry) give 
elevation and ocean depth data in meters recorded as a short integer.  I 
haven't found an example of where the 32-bit integer format is used.  For now, 
I am planning on only coding the 16-bit integer variation.  Does anyone know if 
the 32-bit version is worth supporting?  My criteria for determining that would 
be based on whether there is a significant number of projects using that format 
(life is too short to chase rarely used data formats).

Currently, one of the code-analysis operations conducted by the Commons Imaging 
build process is coverage by JUnit tests.  Lacking any test data for the 32-bit 
case, I am reluctant to include it in the code base because it would mean 
putting uncovered code into the distribution.

Also, I am wondering about the best design for the access API.  The current 
TiffImageParser class has a method called getFloatingPointRasterData() that 
returns an instance of TiffRasterData.  TiffRasterData is pretty much 
hard-wired to floating point data.  I am thinking of creating a new method 
called getIntegerRasterData() that would return an instance of a new class 
called TiffIntegerRasterData. Does this seem reasonable?  I considered trying 
to combine both kinds of results into a uni

[jira] [Updated] (IMAGING-266) Read integer data from GeoTIFFS

2020-09-22 Thread Gary Lucas (Jira)


 [ 
https://issues.apache.org/jira/browse/IMAGING-266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated IMAGING-266:
---
Description: 
I recently discovered that there is a large amount of digital elevation data 
available in the form of 16-bit integer coded data in GeoTIFF files (TIFF files 
with geographic tags).  I propose to enhance the Commons Imaging API to read 
these files.  This work will be similar to the work I did for reading 
floating-point raster data under ISSUE-251.

Available data include the nearly-global coverage of one-second of arc 
elevation data produced from the Shuttle Radar Topography Mission (SRTM) and 
other sources. These products give grids of elevation data with a 30 meter cell 
spacing for most of the world's land masses. They are available at NASA 
Earthdata and Japan Space Systems websites, see 
[https://asterweb.jpl.nasa.gov/gdem.asp|https://asterweb.jpl.nasa.gov/gdem.asp] 
There is also an ocean bathymetry data set available in this format at 
[http://www.shadedrelief.com/blue-earth/]

This new feature will continue to expand the usefulness of the Commons Imaging 
API in accessing GeoTIFF products.

Request for Feedback

So far, the data products I've found (ASTER and Blue Earth Bathymetry) give 
elevation and ocean depth data in meters recorded as a short integer.  I 
haven't found an example of where the 32-bit integer format is used.  For now, 
I am planning on only coding the 16-bit integer variation.  Does anyone know if 
the 32-bit version is worth supporting?  My criteria for determining that would 
be based on whether there is a significant number of projects using that format 
(life is too short to chase rarely used data formats).

Currently, one of the code-analysis operations conducted by the Commons Imaging 
build process is coverage by JUnit tests.  Lacking any test data for the 32-bit 
case, I am reluctant to include it in the code base because it would mean 
putting uncovered code into the distribution.

Also, I am wondering about the best design for the access API.  The current 
TiffImageParser class has a method called getFloatingPointRasterData() that 
returns an instance of TiffRasterData.  TiffRasterData is pretty much 
hard-wired to floating point data.  I am thinking of creating a new method 
called getIntegerRasterData() that would return an instance of a new class 
called TiffIntegerRasterData. Does this seem reasonable?  I considered trying 
to combine both kinds of results into a unified class and method, but it seems 
more unwieldy than useful. 

 

 

  was:
I recently discovered that there is a large amount of digital elevation data 
available in the form of 16-bit integer coded data in GeoTIFF files (TIFF files 
with geographic tags).  I propose to enhance the Commons Imaging API to read 
these files.  This work will be similar to the work I did for reading 
floating-point raster data under ISSUE-251.

Available data include the nearly-global coverage of one-second of arc 
elevation data produced from the Shuttle Radar Topography Mission (SRTM) and 
other sources. These products give grids of elevation data with a 30 meter cell 
spacing for most of the world's land masses. They are available at NASA 
Earthdata and Japan Space Systems websites, see 
[https://asterweb.jpl.nasa.gov/gdem.asp|https://asterweb.jpl.nasa.gov/gdem.asp.]
 There is also a ocean bathymetry data set available in this format at 
[http://www.shadedrelief.com/blue-earth/]

This new feature will continue to expand the usefulness of the Commons Imaging 
API in accessing GeoTIFF products.

Request for Feedback

So far, the data products I've found (ASTER and Blue Earth Bathymetry) give 
elevation and ocean depth data in meters recorded as a short integer.  I 
haven't found an example of where the 32-bit integer format is used.  For now, 
I am planning on only coding the 16-bit integer variation.  Does anyone know if 
the 32-bit version is worth supporting?  My criteria for determining that would 
be based on whether there is a significant number of projects using that format 
(life is too short to chase rarely used data formats).

Currently, one of the code-analysis operations conducted by the Commons Imaging 
build process is coverage by JUnit tests.  Lacking any test data for the 32-bit 
case, I am reluctant to include it in the code base because it would mean 
putting uncovered code into the distribution.

Also, I am wondering about the best design for the access API.  The current 
TiffImageParser class has a method called getFloatingPointRasterData() that 
returns an instance of TiffRasterData.  TiffRasterData is pretty much 
hard-wired to floating point data.  I am thinking of creating a new method 
called getIntegerRasterData() that would return an instance of a new class 
called TiffIntegerRasterData. Does this seem reasonable?  I considered trying 
to combine both kinds of results into a unifie

[jira] [Commented] (IMAGING-266) Read integer data from GeoTIFFS

2020-09-22 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17200369#comment-17200369
 ] 

Gary Lucas commented on IMAGING-266:


I researched this feature request and what I've found is basically a mix of 
good news and bad news. The good news is that the ImageBuilder API can be used 
as a way of storing the integer data.  I looked into the idea of just storing 
the integer data in the BufferedImage output from the file, but it had too much 
of an improvisational flavor for my liking.  It would also mean performing an 
extra transcription of the data. During processing, the ImageBuilder is used as 
a temporary container for pixel information while it is being extracted from 
the TIFF file and collected for output. So my plan is to just provide an API 
element that exposes the ImageBuilder instance to the application.  This 
addition can be accomplished with minimal changes to the existing code base.

The bad news is that the data reader classes will need an enhancement. The 
original authors of Commons Imaging made a design choice which, in retrospect, 
was unlucky.  The TIFF specification allows data, particularly grayscale data, 
to use a variable number of bits to encode an image.  The elevation data that 
was the inspiration for this feature request is based on 16 bits per sample. 
But the Commons Imaging data readers always convert samples to single bytes.  
Most of the time, this doesn't matter. In some cases, such as RGB images, the 
data is already in the form of one byte per sample (3 samples per pixel).

Even though it is common to think of RGB values as consisting of three bytes 
(three "samples", one for each color), there is nothing fundamental about the 
specification of 8 bits per sample.  For example, the current GOES-R generation 
of weather satellites use 12-bit imaging channels to give better discrimination 
of cloud and ground radiance values.  Older satellite images frequently used 10 
bits. However, to simplify the code for its various photometric interpreters 
(the classes that map binary data to pixel colors for rendering images), the 
original implementation built in an adjustment that always converts sample 
values to bytes before passing them on to the photometric interpreters.  If you 
look at the DataReaderStrips and DataReaderTiles classes, you'll see methods 
called getSamplesAsBytes() that do this operation.

For elevation products, this conversion has the consequence of throwing away 
most of the meaningful information in the source data.  

Anyway, addressing this issue requires one of two approaches.  One idea is to 
get rid of the sample conversion and upgrade the photometric interpreters to 
handle the data correctly.  But this change would mean changing multiple 
photometric interpreters, some of which (CieLab, LogLuv, YCbCr) are quite 
complicated. The alternative is to implement a special block of code and a 
special processing rule. This is the approach I propose to implement.

I propose the following: The Commons Imaging API permits an application to pass 
in a custom photometric interpreter. The new feature will implement a special 
processing rule so that, when the application passes in a custom photometric 
interpreter, the data readers will retain the full precision for the samples. 
Eight bit samples will stay 8 bits. Four bit samples will stay 4. And the 16 
bit samples in the elevation products will stay 16. Presumably, an application 
that takes the trouble to supply a custom photometric interpreter will want to 
handle the data exactly as it appears in its source TIFF files.  And, since the 
ability to specify custom photometric interpreters is a relatively new feature 
(only a few months old), it is unlikely that this change will interfere with 
any existing code.
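
The proposed rule can be sketched as follows. This is an illustration only, not the Commons Imaging implementation: the class SampleRuleSketch, the method prepareSample, and its boolean flag are hypothetical names, and the reduction to 8 bits is assumed to be a simple shift or rescale, which may differ from what getSamplesAsBytes() actually does.

```java
/** Illustration only: a hypothetical version of the proposed processing rule. */
final class SampleRuleSketch {

    /**
     * Returns the sample unchanged when a custom photometric interpreter is
     * supplied; otherwise reduces it to the 0..255 range (assumed behavior
     * of the existing data readers).
     */
    static int prepareSample(int sample, int bitsPerSample, boolean customInterpreter) {
        if (customInterpreter || bitsPerSample == 8) {
            // Full precision is retained: 4 bits stay 4, 16 bits stay 16.
            return sample;
        }
        if (bitsPerSample > 8) {
            // Keep only the high-order 8 bits; the low-order bits are lost.
            return sample >>> (bitsPerSample - 8);
        }
        // Fewer than 8 bits: scale up to the 0..255 range.
        return sample * 255 / ((1 << bitsPerSample) - 1);
    }

    public static void main(String[] args) {
        // Two 16-bit elevations 200 m apart become indistinguishable as bytes.
        System.out.println(prepareSample(1024, 16, false));
        System.out.println(prepareSample(1224, 16, false));
        // With a custom interpreter the original values survive.
        System.out.println(prepareSample(1224, 16, true));
    }
}
```

In this sketch the elevations 1024 and 1224 both reduce to the same byte value, which is the loss of information described above for elevation products.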

Please let me know if you have any suggestions or insights that I may have 
missed.

Thanks



> Read integer data  from GeoTIFFS
> 
>
> Key: IMAGING-266
> URL: https://issues.apache.org/jira/browse/IMAGING-266
> Project: Commons Imaging
>  Issue Type: New Feature
>  Components: Format: TIFF
>Affects Versions: 1.0-alpha3
>Reporter: Gary Lucas
>Priority: Major
>
> I recently discovered that there is a large amount of digital elevation data 
> available in the form of 16-bit integer coded data in GeoTIFF files (TIFF 
> files with geographic tags).  I propose to enhance the Commons Imaging API to 
> read these files.  This work will be similar to the work I did for reading 
> floating-point raster data under ISSUE-251.
> Available data include the nearly-global coverage of one-second of arc 
> elevation data produced from the Shuttle Radar Topography Mission (SRTM) and 
> other sources. These products give grids of elevation data with a 30 meter 
> cell spac

[jira] [Updated] (IMAGING-267) Colorful rendering of b/w Monoband TIF

2020-09-24 Thread Gary Lucas (Jira)


 [ 
https://issues.apache.org/jira/browse/IMAGING-267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Lucas updated IMAGING-267:
---
Attachment: RenderElevationTiff.java

> Colorful rendering of b/w Monoband TIF
> --
>
> Key: IMAGING-267
> URL: https://issues.apache.org/jira/browse/IMAGING-267
> Project: Commons Imaging
>  Issue Type: Bug
>Reporter: edgar soldin
>Priority: Major
> Attachments: ISSUE_267.JPG, RenderElevationTiff.java, 
> mdt25a-commons.png, mdt25a-sextante.png, mdt25a.tfw, mdt25a.tif, 
> mdt25a.tif.aux.xml
>
>
> see attached images.
> mdt25a.tif - the original tif
> mdt25a-commons.png - as rendered/read with Commons Imaging
> mdt25a-sextante.png - as rendered /read properly with ImageIO-Core from 
> https://github.com/jai-imageio/jai-imageio-core
> thanks!.. ede



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IMAGING-267) Colorful rendering of b/w Monoband TIF

2020-09-24 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201408#comment-17201408
 ] 

Gary Lucas commented on IMAGING-267:


My analysis here is that the Commons Imaging package already includes an API to 
render the source file.   I am recommending that this JIRA issue be marked as 
"resolved".

The process of rendering files that contain floating-point data is a little bit 
more involved than ordinary image files. To illustrate how it may be 
accomplished, I have uploaded a simple example application called 
RenderElevationTiff.java. This application demonstrates the various API 
elements that support this kind of rendering. There are also related example 
applications in the Commons Imaging source distribution, but this one is 
optimized for elevation data such as that supplied by the test file that was 
provided by Edgar Soldin.

I'm not sure that rendering this kind of data could be completely automated 
because I think that a rendering application would have to have domain-specific 
information. For example, some land-elevation GeoTIFF files include "no-data" 
points specified as a large magnitude negative number. If an application were 
to select gray tones exclusively on the basis of range-of-values, the no-data 
points would be included in that range. [^RenderElevationTiff.java] 
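
The no-data problem can be illustrated with a short sketch. It is not taken from RenderElevationTiff.java; the class GrayScaleSketch and its methods are hypothetical, and the sentinel value -9999 used in the example is an assumption that would have to come from domain knowledge about the particular file.

```java
/** Illustration only: gray-tone scaling that skips a no-data sentinel. */
final class GrayScaleSketch {

    /** Returns {min, max} over the samples, ignoring the no-data value and NaN. */
    static float[] validRange(float[] samples, float noData) {
        float min = Float.POSITIVE_INFINITY;
        float max = Float.NEGATIVE_INFINITY;
        for (float s : samples) {
            if (s == noData || Float.isNaN(s)) {
                continue;
            }
            min = Math.min(min, s);
            max = Math.max(max, s);
        }
        return new float[] {min, max};
    }

    /** Maps a sample to a gray level in 0..255; no-data points are rendered as 0. */
    static int toGray(float sample, float min, float max, float noData) {
        if (sample == noData || Float.isNaN(sample) || max <= min) {
            return 0;
        }
        int gray = Math.round(255f * (sample - min) / (max - min));
        return Math.max(0, Math.min(255, gray));
    }

    public static void main(String[] args) {
        float noData = -9999f; // assumed sentinel, for illustration
        float[] samples = {noData, 100f, 600f, 350f};
        float[] range = validRange(samples, noData);
        for (float s : samples) {
            System.out.println(s + " -> " + toGray(s, range[0], range[1], noData));
        }
    }
}
```

If the sentinel were left in the range, the valid elevations between 100 and 600 would all be squeezed into the top few gray levels, which is why the range has to be computed with knowledge of the no-data value.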

> Colorful rendering of b/w Monoband TIF
> --
>
> Key: IMAGING-267
> URL: https://issues.apache.org/jira/browse/IMAGING-267
> Project: Commons Imaging
>  Issue Type: Bug
>Reporter: edgar soldin
>Priority: Major
> Attachments: ISSUE_267.JPG, RenderElevationTiff.java, 
> mdt25a-commons.png, mdt25a-sextante.png, mdt25a.tfw, mdt25a.tif, 
> mdt25a.tif.aux.xml
>
>
> see attached images.
> mdt25a.tif - the original tif
> mdt25a-commons.png - as rendered/read with Commons Imaging
> mdt25a-sextante.png - as rendered /read properly with ImageIO-Core from 
> https://github.com/jai-imageio/jai-imageio-core
> thanks!.. ede





[jira] [Comment Edited] (IMAGING-267) Colorful rendering of b/w Monoband TIF

2020-09-24 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201408#comment-17201408
 ] 

Gary Lucas edited comment on IMAGING-267 at 9/24/20, 9:44 AM:
--

My analysis here is that the Commons Imaging package already includes an API to 
render the source file. I am recommending that this JIRA issue be marked as 
"resolved".

The process of rendering files that contain floating-point data is a little bit 
more involved than ordinary image files. To illustrate how it may be 
accomplished, I have uploaded a simple example application called 
RenderElevationTiff.java. This application demonstrates the various API 
elements that support this kind of rendering. There are also related example 
applications in the Commons Imaging source distribution, but this one is 
optimized for elevation data such as that supplied by the test file that was 
provided by Edgar Soldin.

I'm not sure that rendering this kind of data could be completely automated 
because I think that a rendering application would have to have domain-specific 
information. For example, some land-elevation GeoTIFF files include "no-data" 
points specified as a large magnitude negative number. If an application were 
to select gray tones exclusively on the basis of range-of-values, the no-data 
points would be included in that range.

 

[^RenderElevationTiff.java]


was (Author: gwlucas):
My analysis here is that the Commons Imaging package already includes an API to 
render the source file.   I am recommending that this JIRA issue be marked as 
"resolved".

The process of rendering files that contain floating-point data is a little bit 
more involved than ordinary image files. To illustrate how it may be 
accomplished, I have uploaded a simple example application called 
RenderElevationTiff.java. This application demonstrates the various API 
elements that support this kind of rendering. There are also related example 
applications in the Commons Imaging source distribution, but this one is 
optimized for elevation data such as that supplied by the test file that was 
provided by Edgar Soldin.

I'm not sure that rendering this kind of data could be completely automated 
because I think that a rendering application would have to have domain-specific 
information. For example, some land-elevation GeoTIFF files include "no-data" 
points specified as a large magnitude negative number. If an application were 
to select gray tones exclusively on the bases of range-of-values, the no-data 
points would be included in that range. [^RenderElevationTiff.java] 

> Colorful rendering of b/w Monoband TIF
> --
>
> Key: IMAGING-267
> URL: https://issues.apache.org/jira/browse/IMAGING-267
> Project: Commons Imaging
>  Issue Type: Bug
>Reporter: edgar soldin
>Priority: Major
> Attachments: ISSUE_267.JPG, RenderElevationTiff.java, 
> mdt25a-commons.png, mdt25a-sextante.png, mdt25a.tfw, mdt25a.tif, 
> mdt25a.tif.aux.xml
>
>
> see attached images.
> mdt25a.tif - the original tif
> mdt25a-commons.png - as rendered/read with Commons Imaging
> mdt25a-sextante.png - as rendered /read properly with ImageIO-Core from 
> https://github.com/jai-imageio/jai-imageio-core
> thanks!.. ede





[jira] [Commented] (IMAGING-201) Significant change in color of output image of tiff file

2020-09-27 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17202815#comment-17202815
 ] 

Gary Lucas commented on IMAGING-201:


I inspected the tags for this image and found that it is stored using the CMYK 
photometric interpreter.  So the problem may be due to an issue in CMYK.

 

The CMYK model is based on the ink colors used in 4-color printing. To convert 
the image to a Java buffered image, there is a CMYK-to-RGB conversion applied. 
It is possible that there is something wrong with the conversion logic. 
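
For reference, the commonly cited uncalibrated CMYK-to-RGB conversion can be sketched as below. This is a minimal illustration, not the Commons Imaging conversion code: the class CmykSketch and the method cmykToArgb are hypothetical names, the samples are assumed to be 8-bit values in 0..255, and no color management is applied.

```java
/** Illustration only: the simple, uncalibrated CMYK-to-RGB formula. */
final class CmykSketch {

    /** Converts 8-bit CMYK samples to a packed, opaque ARGB pixel. */
    static int cmykToArgb(int c, int m, int y, int k) {
        // Each channel is (1 - ink) * (1 - black), scaled back to 0..255.
        int r = Math.round(255f * (1 - c / 255f) * (1 - k / 255f));
        int g = Math.round(255f * (1 - m / 255f) * (1 - k / 255f));
        int b = Math.round(255f * (1 - y / 255f) * (1 - k / 255f));
        // Alpha goes in the top byte so the pixel is fully opaque.
        return 0xff000000 | (r << 16) | (g << 8) | b;
    }

    public static void main(String[] args) {
        // Full cyan ink with no black removes all red from the pixel.
        System.out.println(Integer.toHexString(cmykToArgb(255, 0, 0, 0)));
    }
}
```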

 

I did use the debugger to step through some of the logic related to 
decompressing this image. At first glance, that appears to be working properly.

 

> Significant change in color of output image of tiff file
> 
>
> Key: IMAGING-201
> URL: https://issues.apache.org/jira/browse/IMAGING-201
> Project: Commons Imaging
>  Issue Type: Bug
>  Components: Format: TIFF
>Affects Versions: 1.0-alpha1
>Reporter: Praful Vaishnav
>Priority: Major
> Attachments: SS_result.png, SS_source.png
>
>
> Hii.. I am reading tiff image file and using this library to convert it to 
> PNG format. Resultant png file has significant change in color wrt its 
> original tiff file. Below is the code :
> {code:java}
> File file = new File("C:\\original.tiff");
> BufferedImage bi = Imaging.getBufferedImage(file);
> File dstFile = new File("C:\\result.png");
> Imaging.writeImage(bi, dstFile, ImageFormats.PNG, null);
> {code}
> Reason could be :
> original image is 32 bit depth and result image is 24 bit depth. Is there any 
> issue with 32 bit depth image file.
> Link for source tiff file - 
> https://www.dropbox.com/s/kgqgdcygelkor8b/original.tiff?dl=0
> Attachements:
> SS_source.png - Screenshot of original tiff file
> SS_result.png - Screenshot of result png file





[jira] [Commented] (IMAGING-201) Significant change in color of output image of tiff file

2020-10-02 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17206093#comment-17206093
 ] 

Gary Lucas commented on IMAGING-201:


I investigated this issue further and it appears that the CMYK to RGB 
conversions are implemented correctly in the Commons Imaging library. The 
conversion code takes a round-about approach, but a little algebra shows that 
they are equivalent to the well-known formulas.  I did find that the conversion 
maps CMYK to _linear_ RGB, but that the BufferedImage class used by Java wants 
sRGB. Adding the linear-to-sRGB conversion to the code addressed the incorrect 
green tone on the model's jacket, but resulted in an overall washed-out 
appearance in the image (the poor guy is so pale that he looks like he has a 
bad case of the flu).

I believe that the key to correctly rendering this image is to use the ICC 
Color Profile that is bundled with the image using tag 34675 (0x8773: 
InterColorProfile). There is some support for this in the Commons Imaging color 
package, but it is not integrated into the TIFF code.  
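
As a rough sketch of what profile-based conversion could look like, the standard java.awt.color classes can turn a color into sRGB once the profile bytes are in hand. This is not part of Commons Imaging: the class IccProfileSketch and the method toSrgb are hypothetical, and the byte array is assumed to have already been read from tag 34675 by some other means.

```java
import java.awt.color.ICC_ColorSpace;
import java.awt.color.ICC_Profile;

/** Illustration only: converting one color to sRGB through an embedded ICC profile. */
final class IccProfileSketch {

    /**
     * Converts color components (each in 0..1, in the profile's own color
     * space, e.g. four values for CMYK) to sRGB components in 0..1.
     */
    static float[] toSrgb(byte[] iccProfileBytes, float[] components) {
        // Parse the raw profile data as stored in the TIFF tag.
        ICC_Profile profile = ICC_Profile.getInstance(iccProfileBytes);
        ICC_ColorSpace colorSpace = new ICC_ColorSpace(profile);
        // toRGB performs the color-managed conversion to the default sRGB space.
        return colorSpace.toRGB(components);
    }
}
```

A real integration would convert whole rows or tiles rather than single pixels, since per-pixel calls of this kind are likely to be slow.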

I'm afraid that I am not especially familiar with color models and profiles and 
have taken this issue as far as I can for now. I hope these notes might help 
somebody else get started on a solution.

 

Here's the code I added to my local version of the CMYK photometric interpreter 
to perform the CMYK to RGB conversion:

int makeRGB(int[] samples) {
    // Normalize the 8-bit CMYK samples to the range 0..1.
    final double c = samples[0] / 255.0;
    final double m = samples[1] / 255.0;
    final double y = samples[2] / 255.0;
    final double k = samples[3] / 255.0;

    // Convert to linear RGB, then apply the sRGB transfer function.
    int r = (int) (255 * linearToSRGB((1 - c) * (1 - k)) + 0.5);
    int g = (int) (255 * linearToSRGB((1 - m) * (1 - k)) + 0.5);
    int b = (int) (255 * linearToSRGB((1 - y) * (1 - k)) + 0.5);

    // Pack as an opaque ARGB pixel (alpha in the top byte).
    return 0xff000000 | (r << 16) | (g << 8) | b;
}

double linearToSRGB(double a) {
    if (a <= 0.0031308) {
        return a * 12.92;
    } else {
        return 1.055 * Math.pow(a, 1.0 / 2.4) - 0.055;
    }
}

 

 

 

> Significant change in color of output image of tiff file
> 
>
> Key: IMAGING-201
> URL: https://issues.apache.org/jira/browse/IMAGING-201
> Project: Commons Imaging
>  Issue Type: Bug
>  Components: Format: TIFF
>Affects Versions: 1.0-alpha1
>Reporter: Praful Vaishnav
>Priority: Major
> Attachments: SS_result.png, SS_source.png
>
>
> Hii.. I am reading tiff image file and using this library to convert it to 
> PNG format. Resultant png file has significant change in color wrt its 
> original tiff file. Below is the code :
> {code:java}
> File file = new File("C:\\original.tiff");
> BufferedImage bi = Imaging.getBufferedImage(file);
> File dstFile = new File("C:\\result.png");
> Imaging.writeImage(bi, dstFile, ImageFormats.PNG, null);
> {code}
> Reason could be :
> original image is 32 bit depth and result image is 24 bit depth. Is there any 
> issue with 32 bit depth image file.
> Link for source tiff file - 
> https://www.dropbox.com/s/kgqgdcygelkor8b/original.tiff?dl=0
> Attachements:
> SS_source.png - Screenshot of original tiff file
> SS_result.png - Screenshot of result png file





[jira] [Comment Edited] (IMAGING-201) Significant change in color of output image of tiff file

2020-10-02 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17206093#comment-17206093
 ] 

Gary Lucas edited comment on IMAGING-201 at 10/2/20, 10:55 AM:
---

I investigated this issue further and it appears that the CMYK to RGB 
conversions are implemented correctly in the Commons Imaging library. The 
conversion code takes a round-about approach, but a little algebra shows that 
they are equivalent to the well-known formulas.  I did find that the conversion 
maps CMYK to _linear_ RGB, but that the BufferedImage class used by Java wants 
sRGB. Adding the linear-to-sRGB conversion to the code addressed the incorrect 
green tone on the model's jacket, but resulted in an overall washed-out 
appearance in the image (the poor guy is so pale that he looks like he has a 
bad case of the flu).

I believe that the key to correctly rendering this image is to use the ICC 
Color Profile that is bundled with the image using tag 34675 (0x8773: 
InterColorProfile). There is some support for this in the Commons Imaging color 
package, but it is not integrated into the TIFF code.  

I'm afraid that I am not especially familiar with color models and profiles and 
have taken this issue as far as I can for now. I hope these notes might help 
somebody else get started on a solution.

 

Here's the code I added to my local version of the CMYK photometric interpreter 
to perform the CMYK to RGB conversion:

 
{code:java}
int makeRGB(int[] samples) {
    // Normalize the 8-bit CMYK samples to the range 0..1.
    final double c = samples[0] / 255.0;
    final double m = samples[1] / 255.0;
    final double y = samples[2] / 255.0;
    final double k = samples[3] / 255.0;

    // Convert to linear RGB, then apply the sRGB transfer function.
    int r = (int) (255 * linearToSRGB((1 - c) * (1 - k)) + 0.5);
    int g = (int) (255 * linearToSRGB((1 - m) * (1 - k)) + 0.5);
    int b = (int) (255 * linearToSRGB((1 - y) * (1 - k)) + 0.5);

    // Pack as an opaque ARGB pixel (alpha in the top byte).
    return 0xff000000 | (r << 16) | (g << 8) | b;
}

double linearToSRGB(double a) {
    if (a <= 0.0031308) {
        return a * 12.92;
    } else {
        return 1.055 * Math.pow(a, 1.0 / 2.4) - 0.055;
    }
}
{code}


 

 


was (Author: gwlucas):
I investigated this issue further and it appears that the CMYK to RGB 
conversions are implemented correctly in the Commons Imaging library. The 
conversion code takes a round-about approach, but a little algebra shows that 
they are equivalent to the well-known formulas.  I did find that the conversion 
maps CMYK to _linear_ RGB, but that the BufferedImage class used by Java wants 
sRGB. Adding the linear-to-sRGB conversion to the code addressed the incorrect 
green tone on the model's jacket, but  results in an overall washed out 
appearance in the image (the poor guy is so pale that he looks like he has a 
bad case of the flu).

I believe that the key to correctly rendering this image is to use the ICC 
Color Profile that is bundled with the image using tag 34675 (0x8773: 
InterColorProfile). There is some support for this in the Commons Imaging color 
package, but it is not integrated into the TIFF code.  

I'm afraid that I am not especially familiar with color models and profiles and 
have taken this issue as far as I can for now. I hope these notes might help 
somebody else get started on a solution.

 

Here's the code I added to my local version of the CMYK photometric interpreter 
to perform the CMYK to RGB conversion:

int makeRGB(int[] samples) {
   final double c = samples[0] / 255.0;
    final double m = samples[1] / 255.0;
    final double y = samples[2] / 255.0;
    final double k = samples[3] / 255.0;

   int r = (int) (255 * linearToSRGB((1 - c) * (1 - k)) + 0.5);
    int g = (int) (255 * linearToSRGB((1 - m) * (1 - k)) + 0.5);
    int b = (int) (255 * linearToSRGB((1 - y) * (1 - k)) + 0.5);

   return 0xff000000 | (r << 16) | (g << 8) | b;
 }

double linearToSRGB(double a) {
   if (a <= 0.0031308) {
      return a * 12.92;
   } else {
      return 1.055 * Math.pow(a, 1.0 / 2.4) - 0.055;
   }
 }

 

 

 

> Significant change in color of output image of tiff file
> 
>
> Key: IMAGING-201
> URL: https://issues.apache.org/jira/browse/IMAGING-201
> Project: Commons Imaging
>  Issue Type: Bug
>  Components: Format: TIFF
>Affects Versions: 1.0-alpha1
>Reporter: Praful Vaishnav
>Priority: Major
> Attachments: SS_result.png, SS_source.png
>
>
> Hii.. I am reading tiff image file and using this library to convert it to 
> PNG format. Resultant png file has significant change in color wrt its 
> original tiff file. Below is the code :
> {code:java}
> File file = new File("C:\\original.tiff");
> BufferedImage bi = Imaging.getBufferedImage(file);
> File dstFile = new File("C:\\result.png");
> Imaging.writeImage(
