[jira] [Resolved] (IMAGING-33) Incorrect code for tiled TIFF files applyPredictor method

2012-05-08 Thread Damjan Jovanovic (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMAGING-33?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damjan Jovanovic resolved IMAGING-33.
-

   Resolution: Fixed
Fix Version/s: 1.0

Thank you for your bug report.

Patch and test images committed to SVN, resolving fixed.


> Incorrect code for tiled TIFF files applyPredictor method 
> --
>
> Key: IMAGING-33
> URL: https://issues.apache.org/jira/browse/IMAGING-33
> Project: Apache Commons Imaging
>  Issue Type: Bug
>  Components: Format: TIFF
>Reporter: Gary Lucas
>Priority: Minor
> Fix For: 1.0
>
>
> I believe that the DataReaderTiled class used for reading tiled TIFF files 
> invokes the applyPredictor method with incorrect arguments and will not be 
> able to properly decode TIFF files that use predictors.  The bug was found 
> during a code inspection. Unfortunately, I do not have any samples of data in 
> this format (there are none in the Apache Imaging test files) and cannot 
> verify that this is the case.
> Some Background
> TIFF files are often used to store images in technical applications where 
> data must be faithfully preserved, so lossy compression methods like JPEG are 
> inappropriate and non-lossy methods like LZW must be used. However, continuous 
> tone images like satellite images or photographs often do not compress well 
> since there is little apparent redundancy in the data. To improve the 
> redundancy of the data, TIFF uses a simple predictor.  The first pixel (gray 
> tone or RGB value) in a tile is stored as a literal value.  All subsequent 
> pixels are stored as differences.  To see how this works, imagine a 
> monochrome picture where the gray tones gradually fade from white to black at 
> a steady rate. Although no particular data value is ever repeated (so there 
> is little apparent redundancy in the source data) the delta values remain 
> constant (so a set of delta values will compress very well). When transformed 
> in this manner, certain images show substantial improvements in compression 
> ratio.
> The Problem
> The DataReaderTiled class uses a method called applyPredictor that takes an 
> argument telling it whether the sample passed in is the first value, and 
> should be treated as a literal, or whether it is a subsequent value and 
> should be treated as a delta.   Unfortunately, the parameter it uses is the x 
> coordinate of the pixel to be decoded.  While this approach works for TIFF 
> strip files (where the first pixel always has a coordinate of zero), it does 
> not work for tiles where the first pixel in the tile could fall anywhere in 
> the image. 
>   
> The Fix
> While we could simply fix the argument passed into the predictor, there is a 
> better solution. The predictor performs an if/then operation on the input 
> parameter to find out if it is the first sample in the tile. Once it unpacks 
> a sample, it retains it as the "last" value so that it may be added to the 
> next delta value.  Why not simply get rid of the if/then operation and just 
> ensure that the last value gets zeroed out before beginning the processing of 
> a strip or tile?  This would save an if/then operation and fix the bug.
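The decode logic described above can be sketched in a few lines. The class and method names here are illustrative, not the Commons Imaging API; note how resetting the running "last" value to zero before each strip or tile row makes the first (literal) sample fall out of the same addition as the deltas, which is exactly the fix proposed:

```java
import java.util.Arrays;

// Hypothetical sketch of horizontal-differencing predictor decoding for
// 8-bit samples; names are illustrative, not the Commons Imaging API.
public class PredictorDemo {

    // Decodes one row of a strip or tile. With "last" reset to zero at the
    // start of the row, the first literal sample and the subsequent delta
    // samples are handled by the same addition -- no if/then needed.
    static int[] decodeRow(int[] encoded) {
        int[] out = new int[encoded.length];
        int last = 0; // must be reset per strip/tile row, not per image row
        for (int i = 0; i < encoded.length; i++) {
            last = (last + encoded[i]) & 0xFF; // 8-bit samples wrap mod 256
            out[i] = last;
        }
        return out;
    }

    public static void main(String[] args) {
        // A smooth gradient 100, 98, 96 is stored as 100, -2, -2
        System.out.println(Arrays.toString(decodeRow(new int[] {100, -2, -2})));
        // -> [100, 98, 96]
    }
}
```

Because the decoder is indexed from the start of the row rather than from image coordinates, it works identically whether the row belongs to a strip or to a tile anywhere in the image.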

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (IMAGING-33) Incorrect code for tiled TIFF files applyPredictor method

2012-05-08 Thread Damjan Jovanovic (JIRA)

[ 
https://issues.apache.org/jira/browse/IMAGING-33?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271140#comment-13271140
 ] 

Damjan Jovanovic commented on IMAGING-33:
-

You can use:
{code}
tiffcp -t -c lzw:2 in.tif out.tif
{code}
to generate a suitable tiled, predicted image.


> Incorrect code for tiled TIFF files applyPredictor method 
> --





[jira] [Resolved] (IMAGING-70) Reduce memory use of TIFF readers

2012-05-08 Thread Damjan Jovanovic (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMAGING-70?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damjan Jovanovic resolved IMAGING-70.
-

   Resolution: Fixed
Fix Version/s: 1.0

Thank you! Patch applied to SVN. Resolving fixed.


> Reduce memory use of TIFF readers
> -
>
> Key: IMAGING-70
> URL: https://issues.apache.org/jira/browse/IMAGING-70
> Project: Apache Commons Imaging
>  Issue Type: Improvement
>  Components: Format: TIFF
>Reporter: Gary Lucas
> Fix For: 1.0
>
> Attachments: Tracker_76_Test_5_May_2012.patch
>
>   Original Estimate: 80h
>  Remaining Estimate: 80h
>
> This Tracker Item proposes changes to the TIFF file readers to address memory 
> issues when reading very large images from TIFF files.  The TIFF format is 
> used extensively in technical applications such as aerial photographs, 
> satellite images, and digital raster maps which feature very large image 
> sizes.  For example, the public-domain Natural Earth Data set features raster 
> files sized 21,600 by 10,800 pixels (222.5 megapixels).   Although this 
> example is unusually large, image sizes of 25 to 100 megapixels are common 
> for such applications.
> Unfortunately, when Sanselan reads a TIFF image, it consumes nearly twice as 
> much memory as is necessary.  The reader operates in two stages. First, it 
> reads the entire source file into memory then it builds the output image, 
> also in memory.   In the example file mentioned above, the source data runs 
> from 83.19 to 373 megabytes (depending on compression).  Thus Sanselan would 
> require a minimum of 83.19 + 4*222.5 ≈ 973 megabytes to produce an image for 
> one of these files (allowing 4 bytes per pixel in the output BufferedImage).
> Fortunately, TIFF files are organized so that they can be read a piece at a 
> time.  TIFF files are divided into either strips or tiles and, if data 
> compression is used, each piece is compressed individually.  Thus each 
> individual piece has no dependency on the others. 
> This item proposes to implement two changes:
> 1)  Allow the TIFF data reader to read the files one piece at a time while 
> constructing the buffered image.  Thus the memory use for reading would be no 
> larger than the piece size.  This would be an internal change, so the 
> external appearance of the Sanselan getBufferedImage methods would not change.
> 2) Provide new API elements that permit applications to read the strips or 
> tiles from TIFF files individually. This change would support 
> applications that needed to access very large TIFF files without committing 
> the memory to store a BufferedImage for the entire file (a 222.5 megapixel 
> image requires 890 megabytes, which is a lot even by contemporary standards).
> There is one minor issue in this implementation that is easily addressed.  
> Sanselan reads images from ByteSources that can be either random-access files 
> or sequential-access input streams.  In the case of sequential-input streams, 
> it may be hard to perform a partial read on a TIFF directory.  In such a 
> case, the TIFF access routines might have to resort to reading the entire 
> source data into memory as it currently does.   This would simply be a 
> limitation of the implementation.
> There is one issue that may make this change a bit problematic.  The TIFF 
> processors depend on accessing a class called TiffDataElement that contains a 
> public array of bytes called "data".   The most expeditious way of 
> implementing the enhancement is to make this element private and add an 
> accessor that either returns the data from internal memory or else loads it 
> on-demand.  Unfortunately, because the data element is scoped to public, 
> there is a chance that some existing applications are using it directly.   In 
> hindsight, it is clear that scoping this element as public was a mistake, but 
> it may be too late to fix it.  So care will be required to ensure that 
> compatibility remains.   The most likely solution seems to be to implement a 
> new class for passing raw data from the source TIFF files to the DataReader 
> implementations.
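Change (1) above can be sketched as follows: assemble the output image strip by strip so that only one compressed piece is in memory at a time. All names here are illustrative stand-ins, not the actual Sanselan/Commons Imaging API:

```java
import java.util.Arrays;

// Hypothetical sketch of strip-at-a-time image assembly; names are
// illustrative, not the Sanselan/Commons Imaging API.
public class StripReaderDemo {

    // Stands in for a ByteSource-backed reader that decodes one strip.
    interface StripSource {
        int[] readStrip(int stripIndex);
    }

    // Builds the full pixel array while holding only one strip's worth of
    // decoded data at a time, so peak memory is bounded by the strip size.
    static int[] assemble(StripSource src, int width, int height, int rowsPerStrip) {
        int[] image = new int[width * height];
        int strips = (height + rowsPerStrip - 1) / rowsPerStrip;
        for (int s = 0; s < strips; s++) {
            int[] pixels = src.readStrip(s);       // one piece in memory
            int y = s * rowsPerStrip;
            int rows = Math.min(rowsPerStrip, height - y);
            System.arraycopy(pixels, 0, image, y * width, rows * width);
        }
        return image;
    }

    public static void main(String[] args) {
        // 4x4 image, 2 rows per strip: strip s yields pixels valued s,
        // so rows 0-1 are filled with 0s and rows 2-3 with 1s.
        int[] img = assemble(s -> {
            int[] p = new int[8];
            Arrays.fill(p, s);
            return p;
        }, 4, 4, 2);
        System.out.println(Arrays.toString(img));
    }
}
```

Because TIFF strips (and tiles) are compressed independently, each `readStrip` call can decompress its piece without touching any other, which is what makes this incremental approach possible.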





[jira] [Updated] (MATH-753) Better implementation for the gamma distribution density function

2012-05-08 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/MATH-753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sébastien Brisard updated MATH-753:
---

Attachment: lanczos.patch

{{lanczos.patch}} is the patch discussed on the {{dev}} mailing list:
{quote}
As I initially feared, what was proposed in the JIRA ticket leads to high 
floating point errors. I adapted a method proposed in 
[BOOST|http://www.boost.org/doc/libs/1_35_0/libs/math/doc/sf_and_dist/html/math_toolkit/special/sf_gamma/igamma.html]
 (formula (15))
with acceptable errors. Meanwhile, I've also managed to improve the accuracy of 
the computation of the density for the range of parameters where the previous 
implementation was already working: in this range, the accuracy _was_ about 300 
ulps, and is now 1-2 ulps! {color:red}Note: I might have been too optimistic, 
here. There is a significant improvement, though{color}. I think this 
improvement is worth implementing.

The downside is that I need to expose the Lanczos implementation internally 
used by {{o.a.c.m3.special.Gamma}}. This approximation is so standard that I 
don't see it as a problem. I don't think that it reveals too much of the Gamma 
internals, since the javadoc of {{Gamma.logGamma}} states that it uses this 
approximation. So what I
propose is the addition of two methods in {{Gamma}}:
* {{double getLanczosG()}}: returns the {{g}} constant
* {{double lanczos(double)}}: returns the value of the Lanczos sum.
{quote}

> Better implementation for the gamma distribution density function
> -
>
> Key: MATH-753
> URL: https://issues.apache.org/jira/browse/MATH-753
> Project: Commons Math
>  Issue Type: Improvement
>Affects Versions: 2.2, 3.0, 3.1
>Reporter: Francesco Strino
>Assignee: Sébastien Brisard
>Priority: Minor
>  Labels: improvement, stability
> Fix For: 3.1
>
> Attachments: lanczos.patch
>
>
> The way the density of the gamma distribution function is estimated can be 
> improved.
> It's much more stable to calculate first the log of the density and then 
> exponentiate, otherwise the function returns NaN for high values of the 
> parameters alpha and beta. 
> It would be sufficient to change the public double density(double x) function 
> at line 204 in the file 
> org.apache.commons.math.distribution.GammaDistributionImpl as follows:
> {code}
> return Math.exp(Math.log(x) * (alpha - 1) - Math.log(beta) * alpha - x / beta - Gamma.logGamma(alpha));
> {code}
> In order to improve performance, log(beta) and Gamma.logGamma(alpha) could 
> also be precomputed and stored during initialization.
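The log-space computation proposed above can be sketched as follows. The Lanczos-based logGamma here is the standard textbook approximation (g = 7, nine coefficients), not the Commons Math implementation, and the naive variant is included only to show the NaN failure mode for large parameters:

```java
// Hypothetical sketch of computing the gamma density in log space and
// exponentiating at the end; not the Commons Math implementation.
public class GammaDensityDemo {

    // Standard Lanczos approximation coefficients (g = 7, n = 9).
    private static final double[] LANCZOS = {
        0.99999999999980993, 676.5203681218851, -1259.1392167224028,
        771.32342877765313, -176.61502916214059, 12.507343278686905,
        -0.13857109526572012, 9.9843695780195716e-6, 1.5056327351493116e-7
    };

    static double logGamma(double x) {
        double z = x - 1.0;
        double a = LANCZOS[0];
        for (int i = 1; i < LANCZOS.length; i++) {
            a += LANCZOS[i] / (z + i);
        }
        double t = z + 7.5;
        return 0.5 * Math.log(2 * Math.PI) + (z + 0.5) * Math.log(t) - t + Math.log(a);
    }

    // Shape alpha, scale beta: stay in log space until the final exp.
    static double density(double x, double alpha, double beta) {
        return Math.exp((alpha - 1) * Math.log(x) - alpha * Math.log(beta)
                - x / beta - logGamma(alpha));
    }

    // Naive version for comparison: pow and exp overflow to Infinity for
    // large parameters, and Infinity / Infinity yields NaN.
    static double naiveDensity(double x, double alpha, double beta) {
        return Math.pow(x, alpha - 1) * Math.exp(-x / beta)
                / (Math.pow(beta, alpha) * Math.exp(logGamma(alpha)));
    }

    public static void main(String[] args) {
        System.out.println(density(2.0, 1.0, 2.0));           // ~ e^-1 / 2
        System.out.println(naiveDensity(1000.0, 400.0, 2.0)); // NaN (overflow)
        System.out.println(density(1000.0, 400.0, 2.0));      // small but finite
    }
}
```

The precomputation mentioned in the report applies here too: `Math.log(beta)` and `logGamma(alpha)` depend only on the distribution parameters and could be cached at construction time.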





[jira] [Commented] (CONFIGURATION-496) Excessive Synchronization AbstractFileConfiguration

2012-05-08 Thread Ralph Goers (JIRA)

[ 
https://issues.apache.org/jira/browse/CONFIGURATION-496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270831#comment-13270831
 ] 

Ralph Goers commented on CONFIGURATION-496:
---

There is no simple answer to your question. Commons Configuration used to not 
be thread-safe. Unfortunately, I discovered that after our application was 
already in production. My use case deals with DefaultConfigurationBuilder 
constructing DynamicCombinedConfigurations and XMLConfigurations all with file 
reloading.  Due to the way CombinedConfigurations work, every time a 
configuration file was reloaded, other configurations that shared the common 
file were corrupted.

Unfortunately, you can't check to see if a reload is required without holding 
the lock or multiple threads will end up performing the reload. 

All of this was put into place while the minimum version was Java 1.4. Now that 
it is Java 5, much of this code can be redone to take advantage of 
java.util.concurrent. I just haven't gotten to it. However, I've had reports of 
similar issues from my users, so I plan to address these issues in the very 
near future.
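One java.util.concurrent shape for this, sketched under assumptions (the class and method names are illustrative, not the Commons Configuration API): an AtomicBoolean gate lets exactly one thread perform the reload while concurrent callers skip it and continue with the old configuration, instead of queuing on a monitor.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of a lock-free reload gate; illustrative only,
// not the Commons Configuration API.
public class ReloadGateDemo {

    private final AtomicBoolean reloading = new AtomicBoolean(false);
    private volatile int reloadCount = 0;

    void reloadIfRequired(boolean reloadingRequired) {
        if (!reloadingRequired) {
            return;                                   // cheap check, no lock taken
        }
        if (reloading.compareAndSet(false, true)) {   // exactly one winner
            try {
                reloadCount++;                        // stands in for clear() + load()
            } finally {
                reloading.set(false);
            }
        }
        // Losing threads return immediately and read the old (possibly
        // stale) configuration rather than blocking or reloading twice.
    }

    int getReloadCount() {
        return reloadCount;
    }
}
```

The trade-off is that losing threads may briefly see stale data during a reload; whether that is acceptable depends on the consistency guarantees the configuration is expected to provide.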

> Excessive Synchronization AbstractFileConfiguration
> ---
>
> Key: CONFIGURATION-496
> URL: https://issues.apache.org/jira/browse/CONFIGURATION-496
> Project: Commons Configuration
>  Issue Type: Bug
>  Components: File reloading
>Affects Versions: 1.6, 1.7, 1.8
>Reporter: Tim Canavan
>
> We are having a problem with commons configuration 1.6 
> AbstractFileConfiguration 
> During a stress test we are seeing that we have wait locks against this 
> method causing this method not to complete for up to one second.
> We are using the FileReloadStrategy delegate which makes a check on the file 
> system when now + interval is greater than the compare time.
> Why can't we make this check before the synchronized block, thus increasing 
> throughput?  I have noticed in 1.8 that the caller to this method is 
> synchronized. This seems like excessive synchronization. Any ideas how to 
> solve this? 
> {code}
> public void reload()
> {
>     synchronized (reloadLock)
>     {
>         if (noReload == 0)
>         {
>             try
>             {
>                 enterNoReload(); // avoid reentrant calls
>                 if (strategy.reloadingRequired())
>                 {
>                     if (getLogger().isInfoEnabled())
>                     {
>                         getLogger().info("Reloading configuration. URL is " + getURL());
>                     }
>                     fireEvent(EVENT_RELOAD, null, getURL(), true);
>                     setDetailEvents(false);
>                     boolean autoSaveBak = this.isAutoSave(); // save the current state
>                     this.setAutoSave(false); // deactivate autoSave to prevent information loss
>                     try
>                     {
>                         clear();
>                         load();
>                     }
>                     finally
>                     {
>                         this.setAutoSave(autoSaveBak); // set autoSave to previous value
>                         setDetailEvents(true);
>                     }
>                     fireEvent(EVENT_RELOAD, null, getURL(), false);
>                     // notify the strategy
>                     strategy.reloadingPerformed();
>                 }
>             }
>             catch (Exception e)
>             {
>                 fireError(EVENT_RELOAD, null, null, e);
>                 // todo rollback the changes if the file can't be reloaded
>             }
>             finally
>             {
>                 exitNoReload();
>             }
>         }
>     }
> }
> {code}





[jira] [Updated] (CONFIGURATION-496) Excessive Synchronization AbstractFileConfiguration

2012-05-08 Thread Tim Canavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONFIGURATION-496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Canavan updated CONFIGURATION-496:
--

Description: 
We are having a problem with commons configuration 1.6 
AbstractFileConfiguration 

During a stress test we are seeing that we have wait locks against this method 
causing this method not to complete for up to one second.

We are using the FileReloadStrategy delegate which makes a check on the file 
system when now + interval is greater than the compare time.

Why can't we make this check before the synchronized block, thus increasing 
throughput?  I have noticed in 1.8 that the caller to this method is 
synchronized. This seems like excessive synchronization. Any ideas how to solve 
this? 

{code}
public void reload()
{
    synchronized (reloadLock)
    {
        if (noReload == 0)
        {
            try
            {
                enterNoReload(); // avoid reentrant calls

                if (strategy.reloadingRequired())
                {
                    if (getLogger().isInfoEnabled())
                    {
                        getLogger().info("Reloading configuration. URL is " + getURL());
                    }
                    fireEvent(EVENT_RELOAD, null, getURL(), true);
                    setDetailEvents(false);
                    boolean autoSaveBak = this.isAutoSave(); // save the current state
                    this.setAutoSave(false); // deactivate autoSave to prevent information loss
                    try
                    {
                        clear();
                        load();
                    }
                    finally
                    {
                        this.setAutoSave(autoSaveBak); // set autoSave to previous value
                        setDetailEvents(true);
                    }
                    fireEvent(EVENT_RELOAD, null, getURL(), false);

                    // notify the strategy
                    strategy.reloadingPerformed();
                }
            }
            catch (Exception e)
            {
                fireError(EVENT_RELOAD, null, null, e);
                // todo rollback the changes if the file can't be reloaded
            }
            finally
            {
                exitNoReload();
            }
        }
    }
}
{code}

  was: (the same description, previously without the {code} formatting)

> Excessive Synchronization AbstractFileConfiguration
> 

[jira] [Created] (CONFIGURATION-496) Excessive Synchronization AbstractFileConfiguration

2012-05-08 Thread Tim Canavan (JIRA)
Tim Canavan created CONFIGURATION-496:
-

 Summary: Excessive Synchronization AbstractFileConfiguration
 Key: CONFIGURATION-496
 URL: https://issues.apache.org/jira/browse/CONFIGURATION-496
 Project: Commons Configuration
  Issue Type: Bug
  Components: File reloading
Affects Versions: 1.8, 1.7, 1.6
Reporter: Tim Canavan


