[jira] [Commented] (STATISTICS-71) Implementation of Univariate Statistics

2023-07-03 Thread Alex Herbert (Jira)


[ 
https://issues.apache.org/jira/browse/STATISTICS-71?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739479#comment-17739479
 ] 

Alex Herbert commented on STATISTICS-71:


I thought we dropped the requirement for:
{code:java}
long getCount();
{code}
I do not see the need for:
{code:java}
boolean isStoreless();
{code}
Why does the end-user care? What is the function of identifying a statistic as 
storeless? The method is trivial so adding it later would be simple given it 
could be a default method in the interface (return true). Perhaps we leave this 
until implementations are added that are not storeless.

The difference for a stored statistic is that it maintains an array (or other 
storage) of all observed values as the computation requires all input (e.g. median). If 
this does get implemented then a stored statistic could provide access to the 
values via a child interface, e.g.
{code:java}
public interface StoredDoubleStatistic extends DoubleStatistic {
    DoubleStream streamValues();
}
{code}
One issue with the Accumulator is that if you do implement stored statistics 
then you can combine any stored statistic with another. The type will not 
matter, which opens the possibility that the implementation can merge the 
underlying storage and share it between the two. This may require merge methods 
specifically for stored statistics.
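
A minimal sketch of the point about combining stored statistics, assuming the `streamValues()` accessor from the child interface proposed above. The names `StoredCombineSketch`, `AbstractStored`, `Median` and `Range` are hypothetical and not part of the proposed API, and the interface here extends the JDK `DoubleSupplier` directly so the example stays self-contained. This version copies the other statistic's values rather than sharing storage, which a real implementation might well do differently.

```java
import java.util.Arrays;
import java.util.function.DoubleSupplier;
import java.util.stream.DoubleStream;

final class StoredCombineSketch {

    /** Assumed shape: a statistic that can replay every observed value. */
    interface StoredDoubleStatistic extends DoubleSupplier {
        DoubleStream streamValues();
    }

    /** Hypothetical base class holding the storage logic (naive growth, sketch only). */
    abstract static class AbstractStored implements StoredDoubleStatistic {
        private double[] values = new double[0];

        void add(double v) {
            values = Arrays.copyOf(values, values.length + 1);
            values[values.length - 1] = v;
        }

        /** Accepts any stored statistic, whatever it computes. */
        void combine(StoredDoubleStatistic other) {
            values = DoubleStream.concat(streamValues(), other.streamValues()).toArray();
        }

        @Override
        public DoubleStream streamValues() {
            return Arrays.stream(values);
        }
    }

    static final class Median extends AbstractStored {
        @Override
        public double getAsDouble() {
            double[] sorted = streamValues().sorted().toArray();
            int n = sorted.length;
            if (n == 0) {
                return Double.NaN;
            }
            return (n & 1) == 1 ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2;
        }
    }

    static final class Range extends AbstractStored {
        @Override
        public double getAsDouble() {
            return streamValues().max().orElse(Double.NaN)
                - streamValues().min().orElse(Double.NaN);
        }
    }

    public static void main(String[] args) {
        Median median = new Median();
        median.add(1);
        median.add(2);
        Range range = new Range();
        range.add(3);
        range.add(10);
        range.add(20);
        // A Median absorbs the values held by a Range: the type does not matter.
        median.combine(range);
        System.out.println(median.getAsDouble()); // 3.0
    }
}
```

Because `combine` only relies on `streamValues()`, it needs no knowledge of what the other statistic computes, which is why a separate merge method for stored statistics may be worth considering.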

I am wondering if this is required:

 
{code:java}
Statistic getStatistic();
{code}
I do not think there is a use case for having an instance of a DoubleStatistic, 
not knowing what it is and having to query it. If it is to help building a 
combiner of statistics dynamically then this is an implementation detail and 
not part of the public API.

If you remove the count and storeless flag (for now) you are closer to a 
minimal API. If you remove the getStatistic method then you are left with 
nothing in the DoubleStatistic interface and it becomes a combiner of JDK APIs. 
This is truly minimal and a point to start for an implementation since the 
methods are fixed by the JDK and so rewrites will not be necessary as 
development progresses and reveals additional requirements.
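
As an illustration only, a combiner of JDK APIs might look like the sketch below. It assumes the two JDK interfaces are `DoubleConsumer` and `DoubleSupplier`; this comment does not name them, and the `Max` class is a hypothetical statistic used only to exercise the interface.

```java
import java.util.function.DoubleConsumer;
import java.util.function.DoubleSupplier;
import java.util.stream.DoubleStream;

final class MinimalStatisticSketch {

    /** Hypothetical minimal interface: no methods of its own, only JDK ones. */
    interface DoubleStatistic extends DoubleConsumer, DoubleSupplier {
    }

    /** Hypothetical storeless statistic used only to exercise the interface. */
    static final class Max implements DoubleStatistic {
        private double max = Double.NEGATIVE_INFINITY;

        @Override
        public void accept(double value) {
            max = Math.max(max, value);
        }

        @Override
        public double getAsDouble() {
            return max;
        }
    }

    public static void main(String[] args) {
        Max max = new Max();
        // Any DoubleConsumer can be fed directly from a stream.
        DoubleStream.of(1.5, -2.0, 4.25).forEach(max);
        System.out.println(max.getAsDouble()); // 4.25
    }
}
```

Since `accept` and `getAsDouble` are fixed by the JDK, later additions such as a count or a storeless flag would extend this shape rather than rewrite it.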

Notes:
 * The {{of}} method is a factory constructor and should return a new instance. 
Your implementation is more like {{add}}.
 * This code does not distinguish -0.0 and 0.0: {{if (d < min)}}. As such 
you have the possibility of multiple implementations of Min, e.g. one using 
the less-than operator and one using {{Double.min}}.
 * Naming conventions: {{Statistic.Min => Statistic.MIN}}.
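
The signed-zero note can be shown with a small sketch. The class and method names below are hypothetical; only the `d < min` comparison and `Double.min` come from the comment above.

```java
final class SignedZeroMinDemo {

    /** Running minimum updated with the less-than operator. */
    static double minByLessThan(double... values) {
        double min = Double.POSITIVE_INFINITY;
        for (double d : values) {
            // -0.0 < 0.0 is false, so a -0.0 never replaces a stored 0.0.
            if (d < min) {
                min = d;
            }
        }
        return min;
    }

    /** Running minimum updated with Double.min, which orders -0.0 below 0.0. */
    static double minByDoubleMin(double... values) {
        double min = Double.POSITIVE_INFINITY;
        for (double d : values) {
            min = Double.min(min, d);
        }
        return min;
    }

    public static void main(String[] args) {
        System.out.println(minByLessThan(0.0, -0.0));  // 0.0
        System.out.println(minByDoubleMin(0.0, -0.0)); // -0.0
    }
}
```

The two variants also differ for NaN input: the comparison silently skips NaN, while `Double.min` returns NaN.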

> Implementation of Univariate Statistics
> ---
>
> Key: STATISTICS-71
> URL: https://issues.apache.org/jira/browse/STATISTICS-71
> Project: Commons Statistics
> Issue Type: Task
> Components: descriptive
> Reporter: Anirudh Joshi
> Priority: Minor
> Labels: gsoc, gsoc2023
>
> Jira ticket to track the implementation of the Univariate statistics required 
> for the updated SummaryStatistics API. 
> The implementation would be "storeless". It should be used for calculating 
> statistics that can be computed in one pass through the data without storing 
> the sample values.
> Currently I have the definition of API as (this might evolve as I continue 
> working)
> {code:java}
> public interface DoubleStorelessUnivariateStatistic extends DoubleSupplier {
>     DoubleStorelessUnivariateStatistic add(double v);
>     long getCount();
>     void combine(DoubleStorelessUnivariateStatistic other);
> }
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [commons-math] orionlibs closed pull request #233: new method that takes string and extracts list of numbers

2023-07-03 Thread via GitHub


orionlibs closed pull request #233: new method that takes string and extracts 
list of numbers
URL: https://github.com/apache/commons-math/pull/233


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@commons.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [commons-lang] orionlibs commented on pull request #1071: LANG-1695

2023-07-03 Thread via GitHub


orionlibs commented on PR #1071:
URL: https://github.com/apache/commons-lang/pull/1071#issuecomment-1617697366

   thank you Gary. I just registered again for JIRA with the
   email = ***@***.***
   username = dimitrios.efthymiou
   
   On Mon, 3 Jul 2023 at 01:40, Gary Gregory ***@***.***> wrote:
   
   > @garydgregory  I am new to open source
   > contribution and a week ago I started with Apache. I applied for a JIRA
   > account, but I was denied one (with no reason given). So, I cannot login to
   > JIRA. Here is the email from apache: We regret to inform you that, upon
   > reviewing your request for a new Jira account connected with The Apache
   > Software Foundation, the commons project has chosen to deny the request. We
   > therefore will not create the Jira account.
   >
   > The following reason was given: No reason given.
   >
   > If you wish to appeal this decision, you may do so either directly with
   > the commons project, or by contacting the ASF Infrastructure team at:
   > ***@***.***
   >
   > @orionlibs 
   > Please apply again for a Jira account, I'll watch for it.
   





[jira] [Commented] (STATISTICS-71) Implementation of Univariate Statistics

2023-07-03 Thread Gilles Sadowski (Jira)


[ 
https://issues.apache.org/jira/browse/STATISTICS-71?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739512#comment-17739512
 ] 

Gilles Sadowski commented on STATISTICS-71:
---

bq. [...] truly minimal and a point to start for an implementation [...]

Yes!



[jira] [Commented] (LANG-1647) Add and ExceptionUtils.isChecked() and isUnchecked()

2023-07-03 Thread Dimitrios Efthymiou (Jira)


[ 
https://issues.apache.org/jira/browse/LANG-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739523#comment-17739523
 ] 

Dimitrios Efthymiou commented on LANG-1647:
---

The current implementation in 
[https://github.com/apache/commons-lang/pull/1069] treats isUnchecked() as the 
logical opposite of isChecked(), which is not correct because the isUnchecked() 
case is missing the null check. I am creating a PR for it now.
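
A hedged sketch of why the negation and the null check interact. The method bodies below are assumptions written for illustration, not code copied from the pull request, and `CheckedPredicateSketch` and `isUncheckedByNegation` are hypothetical names.

```java
import java.io.IOException;

final class CheckedPredicateSketch {

    /** Assumed rule: a null throwable is neither checked nor unchecked. */
    static boolean isChecked(Throwable t) {
        return t != null && !(t instanceof Error) && !(t instanceof RuntimeException);
    }

    /** Written as a plain negation: reports null as unchecked. */
    static boolean isUncheckedByNegation(Throwable t) {
        return !isChecked(t);
    }

    /** Written with its own null check. */
    static boolean isUnchecked(Throwable t) {
        return t != null && (t instanceof Error || t instanceof RuntimeException);
    }

    public static void main(String[] args) {
        System.out.println(isUncheckedByNegation(null));              // true
        System.out.println(isUnchecked(null));                        // false
        System.out.println(isUnchecked(new IllegalStateException())); // true
        System.out.println(isChecked(new IOException()));             // true
    }
}
```

For every non-null throwable the two versions of the unchecked test agree; they differ only for null.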

> Add and ExceptionUtils.isChecked() and isUnchecked()
> 
>
> Key: LANG-1647
> URL: https://issues.apache.org/jira/browse/LANG-1647
> Project: Commons Lang
> Issue Type: Improvement
> Reporter: Arturo Bernal
> Priority: Minor
> Fix For: 3.13.0
>
> Time Spent: 3.5h
> Remaining Estimate: 0h
>
> The idea is to have a function that checks whether a given throwable is a 
> checked exception. There are some similar functions in ConcurrentUtils, but 
> they have package visibility and return the same exception. 
> It seems logical to have this check in the class that has this responsibility: 
> ExceptionUtils.





[jira] [Commented] (IMAGING-356) TIFF reading extremely slow in version 1.0-SNAPSHOT

2023-07-03 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739567#comment-17739567
 ] 

Gary Lucas commented on IMAGING-356:


Ha! Good one. What I meant to say, of course, is that it was in the 
package dedicated to the TIFF format...

org.apache.commons.imaging.formats.tiff.

and you can find that specific class under the "data readers" subdirectory

org.apache.commons.imaging.formats.tiff.datareaders

> TIFF reading extremely slow in version 1.0-SNAPSHOT
> ---
>
> Key: IMAGING-356
> URL: https://issues.apache.org/jira/browse/IMAGING-356
> Project: Commons Imaging
> Issue Type: Bug
> Components: Format: TIFF
> Affects Versions: 1.0
> Reporter: Gary Lucas
> Priority: Major
>
> I am using the latest code from github (1.0-SNAPSHOT downloaded from github 
> of June 2023) to read a 300 megabyte TIFF file.  Version 1.0-alpha3 required 
> 673 milliseconds to read that file.  The new code requires upward of 15 
> minutes.   Clearly something got broken since the last release.
> The TIFF file is a 1x1 pixel 4 byte image format organized in strips. 
>  The bottleneck appears to occur in the TiffReader getTiffRawImageData method 
> which reads raw data from the file in preparation of creating a BufferedImage 
> object.
> I suspect that there may be a general slowness of file access.  In debugging, 
> even reading the initial metadata (22 TIFF tags) took a couple of seconds.  





[GitHub] [commons-lang] garydgregory merged pull request #1079: LANG-1647-correction-of-isUnchecked-method

2023-07-03 Thread via GitHub


garydgregory merged PR #1079:
URL: https://github.com/apache/commons-lang/pull/1079





[GitHub] [commons-lang] garydgregory commented on pull request #1079: LANG-1647-correction-of-isUnchecked-method

2023-07-03 Thread via GitHub


garydgregory commented on PR #1079:
URL: https://github.com/apache/commons-lang/pull/1079#issuecomment-1617940185

   @orionlibs 
   In the future, or even now, please add a link in the description to the Jira 
ticket if there is one.





[GitHub] [commons-lang] garydgregory commented on pull request #1079: LANG-1647-correction-of-isUnchecked-method

2023-07-03 Thread via GitHub


garydgregory commented on PR #1079:
URL: https://github.com/apache/commons-lang/pull/1079#issuecomment-1617943538

   Also, don't use this "--" notation, whatever it is. GH will render lists with "-".





[jira] [Closed] (LANG-1647) Add and ExceptionUtils.isChecked() and isUnchecked()

2023-07-03 Thread Gary D. Gregory (Jira)


 [ 
https://issues.apache.org/jira/browse/LANG-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary D. Gregory closed LANG-1647.
-



[jira] [Deleted] (LANG-1647) Add and ExceptionUtils.isChecked() and isUnchecked()

2023-07-03 Thread Gary D. Gregory (Jira)


 [ 
https://issues.apache.org/jira/browse/LANG-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary D. Gregory deleted LANG-1647:
--




[jira] [Commented] (LANG-1647) Add and ExceptionUtils.isChecked() and isUnchecked()

2023-07-03 Thread Gary D. Gregory (Jira)


[ 
https://issues.apache.org/jira/browse/LANG-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739583#comment-17739583
 ] 

Gary D. Gregory commented on LANG-1647:
---

This ticket creates noise and is not needed; there is ALREADY a ticket for this 
new API.



[GitHub] [commons-lang] theshoeshiner commented on pull request #1062: Syntax for optional tokens in DurationFormatUtils

2023-07-03 Thread via GitHub


theshoeshiner commented on PR #1062:
URL: https://github.com/apache/commons-lang/pull/1062#issuecomment-1618093846

   @aherbert 
   
   Committed 
https://github.com/apache/commons-lang/pull/1062/commits/64f4ffd95e10d2d93ba4da3fcc16d1ca5afc76eb
 with updated javadoc to explain optional syntax. Added a few additional tests 
for handling of literals and testing the `formatPeriod` methods.





[GitHub] [commons-lang] garydgregory commented on pull request #1062: Syntax for optional tokens in DurationFormatUtils

2023-07-03 Thread via GitHub


garydgregory commented on PR #1062:
URL: https://github.com/apache/commons-lang/pull/1062#issuecomment-1618232622

   Committers:
   Run `mvn` (just that) locally before you push to avoid these types of build 
failures.
   





[jira] [Commented] (IMAGING-355) Large animated GIF takes too much heap memory in getMetadata

2023-07-03 Thread Gary D. Gregory (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739614#comment-17739614
 ] 

Gary D. Gregory commented on IMAGING-355:
-

[~akhoury]
Thank you for your report.

Would you try building and using git master, or use a snapshot build, and see 
if it helps? 

Snapshot builds are here: 
https://repository.apache.org/content/repositories/snapshots/org/apache/commons/commons-imaging/1.0-SNAPSHOT/


> Large animated GIF takes too much heap memory in getMetadata
> 
>
> Key: IMAGING-355
> URL: https://issues.apache.org/jira/browse/IMAGING-355
> Project: Commons Imaging
> Issue Type: Bug
> Components: Format: GIF
> Affects Versions: 1.0-alpha3
> Reporter: Andrew Khoury
> Priority: Major
> Attachments: commons-imaging-test.gif, 
> image-2023-06-29-15-18-17-076.png, screenshot-1.png
>
>
> When calling ImageParser.getMetadata on large animated gif files, the java 
> heap consumption is extremely high.
> For example, see the test project I created:
> [https://github.com/andrewmkhoury/commons-imaging-gif-test]
> When calling ImageParser.getMetadata on the attached 5MB gif 
> [^commons-imaging-test.gif], it uses ~1.5GB of heap space.  When the max heap 
> is set to -Xmx1488M or lower it fails with this exception. When the heap is 
> set to -Xmx1489M it works.
> {code:java}
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>   at java.base/java.util.Arrays.copyOf(Arrays.java:3745)
>   at java.base/java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:211)
>   at org.apache.commons.imaging.mylzw.MyLzwDecompressor.decompress(MyLzwDecompressor.java:143)
>   at org.apache.commons.imaging.formats.gif.GifImageParser.readImageDescriptor(GifImageParser.java:881)
>   at org.apache.commons.imaging.formats.gif.GifImageParser.readBlocks(GifImageParser.java:596)
>   at org.apache.commons.imaging.formats.gif.GifImageParser.readFile(GifImageParser.java:696)
>   at org.apache.commons.imaging.formats.gif.GifImageParser.readFile(GifImageParser.java:680)
>   at org.apache.commons.imaging.formats.gif.GifImageParser.getMetadata(GifImageParser.java:485)
>   at org.apache.commons.imaging.formats.gif.GifImageParser.getMetadata(GifImageParser.java:58)
>   at org.apache.commons.imaging.ImageParser.getMetadata(ImageParser.java:832)
>   at Test.main(Test.java:28)
> {code}
> To generate the large gif file I did the following:
> 1. Install ffmpeg and gifsicle
> 2. Use quicktime to create a screen recording
> 3. Generate a gif out of the screen recording
> {code:bash}
> ffmpeg -i ~/Desktop/Screen\ Recording\ 2023-06-29\ at\ 12.09.21\ PM.mov \
>   -pix_fmt rgb8 -r 10 commons-imaging-test.gif \
>   && gifsicle -O3 commons-imaging-test.gif -o commons-imaging-test.gif
> {code}
> To run the test program:
> {code:bash}
> git clone g...@github.com:andrewmkhoury/commons-imaging-gif-test.git
> cd commons-imaging-gif-test
> mvn assembly:assembly
> java -Xmx1g -jar target/IMAGING-test-0.0.1-SNAPSHOT-jar-with-dependencies.jar \
>   commons-imaging-test.gif
> {code}
> Heap analysis via Eclipse MAT shows that the ImageDescriptor.imageData 
> storing the bytes of each frame is the cause of the problem:
> !image-2023-06-29-15-18-17-076.png|width=169,height=93!!screenshot-1.png|width=226,height=86!





[jira] [Commented] (IMAGING-356) TIFF reading extremely slow in version 1.0-SNAPSHOT

2023-07-03 Thread Gary D. Gregory (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739636#comment-17739636
 ] 

Gary D. Gregory commented on IMAGING-356:
-

Hi [~gwlucas]
Please try again with git master or a snapshot build. My avg time went from 
1451 ms to 28 ms.




[jira] [Commented] (STATISTICS-71) Implementation of Univariate Statistics

2023-07-03 Thread Anirudh Joshi (Jira)


[ 
https://issues.apache.org/jira/browse/STATISTICS-71?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739654#comment-17739654
 ] 

Anirudh Joshi commented on STATISTICS-71:
-

{quote}[. . .] if you do implement stored statistics then you can combine any 
stored statistic with another. The type will not matter [. . .] 

{quote}



[jira] [Comment Edited] (STATISTICS-71) Implementation of Univariate Statistics

2023-07-03 Thread Anirudh Joshi (Jira)


[ 
https://issues.apache.org/jira/browse/STATISTICS-71?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739654#comment-17739654
 ] 

Anirudh Joshi edited comment on STATISTICS-71 at 7/3/23 3:18 PM:
-

{quote}[. . .] if you do implement stored statistics then you can combine any 
stored statistic with another. The type will not matter [. . .]
{quote}
Do you think it is clearer if I rename StatisticAccumulator to 
TypeSafeStatisticAccumulator for now? Also, when we implement stored 
statistics in the future we could have another accumulator interface, 
StoredStatisticAccumulator, as follows:
{code:java}
public interface StoredStatisticAccumulator {
    void combine(StoredDoubleStatistic other);
}{code}
The statistic implementation classes would then implement both 
TypeSafeStatisticAccumulator and StoredStatisticAccumulator. The "stored" 
flavors could support both the type-safe combine and the unbounded combine. 
Could the "storeless" flavors throw an UnsupportedOperationException if 
combine(StoredDoubleStatistic other) is invoked on the instance?

 




[jira] [Comment Edited] (STATISTICS-71) Implementation of Univariate Statistics

2023-07-03 Thread Anirudh Joshi (Jira)


[ 
https://issues.apache.org/jira/browse/STATISTICS-71?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739654#comment-17739654
 ] 

Anirudh Joshi edited comment on STATISTICS-71 at 7/3/23 3:21 PM:
-

{quote}[. . .] if you do implement stored statistics then you can combine any 
stored statistic with another. The type will not matter [. . .]
{quote}
Do you think it is clearer if I rename StatisticAccumulator to 
TypeSafeStatisticAccumulator for now? Also, when we implement stored 
statistics in the future we could have another accumulator interface, 
StoredStatisticAccumulator, as follows:
{code:java}
public interface StoredStatisticAccumulator {
    void combine(StoredDoubleStatistic other);
}{code}
The statistic implementation classes would then implement both 
TypeSafeStatisticAccumulator and StoredStatisticAccumulator. The "stored" 
flavors could support both the type-safe combine and the unbounded combine. 
Could the "storeless" flavors throw an UnsupportedOperationException if 
combine(StoredDoubleStatistic other) is invoked on the instance?

 




[jira] [Commented] (STATISTICS-71) Implementation of Univariate Statistics

2023-07-03 Thread Alex Herbert (Jira)


[ 
https://issues.apache.org/jira/browse/STATISTICS-71?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739659#comment-17739659
 ] 

Alex Herbert commented on STATISTICS-71:


{quote}Do you think it is clearer if I rename StatisticAccumulator to 
TypeSafeStatisticAccumulator for now?
{quote}
It is over-verbose. The functionality being bound to a type should make it 
apparent that it is typed. I would just go with a minimal interface and get 
going on creating an implementation.

The details of stored statistics can be addressed when they are implemented.

 



[jira] [Commented] (IMAGING-356) TIFF reading extremely slow in version 1.0-SNAPSHOT

2023-07-03 Thread Gary Lucas (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739752#comment-17739752
 ] 

Gary Lucas commented on IMAGING-356:


Nice job, it will be interesting to see what you did.

Running your new code on PICT2833.TIF, the speed and memory test yielded

{code:java}
 time to load image   --       memory
 time ms    avg ms    --   used mb   total mb
[snip...]
  22.937    23.390    --    48.259    479.500
{code}

The "avg ms" is the average time in milliseconds to load the image.  The way 
"speed and memory" works is that it loads the image 10 times.  The first three 
times are ignored by the tabulation (the idea being that they include overhead 
for the classloader and JIT compiler).  The worst of the 10 trials is also 
ignored.  The first column gives the time for the current test, but it really 
doesn't mean much (I included it only to see if something abnormal happens 
during the test).
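
The tabulation described above can be sketched as follows. This is an illustration under stated assumptions, not the actual "speed and memory" test code: it assumes the slowest run is dropped from the runs that remain after the warm-up runs are skipped, and the timings in `main` are made-up values. The names `TrimmedTimingSketch` and `averageMillis` are hypothetical.

```java
import java.util.Arrays;

final class TrimmedTimingSketch {

    /** Averages timings after skipping warm-up runs and the slowest remaining run. */
    static double averageMillis(double[] timesMs, int warmUpRuns) {
        double[] kept = Arrays.copyOfRange(timesMs, warmUpRuns, timesMs.length);
        if (kept.length < 2) {
            throw new IllegalArgumentException("need at least two timed runs after warm-up");
        }
        double sum = 0;
        double worst = Double.NEGATIVE_INFINITY;
        for (double t : kept) {
            sum += t;
            worst = Math.max(worst, t);
        }
        // Remove the single worst run from both the sum and the count.
        return (sum - worst) / (kept.length - 1);
    }

    public static void main(String[] args) {
        // Ten invented timings: three warm-up runs, then seven measured runs.
        double[] timesMs = {90, 60, 40, 20, 22, 21, 35, 19, 20, 18};
        System.out.println(averageMillis(timesMs, 3)); // 20.0
    }
}
```

With ten runs and three warm-up runs, the average is taken over six timings.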

So that was the good news and it is an outstanding improvement.  I am a bit 
reluctant to mention it, but there might still be an opportunity for a small 
improvement.  Looking at the same test using the previous version of the code 
(before Commons IO was integrated into it), I see the following results.


{code:java}
 time to load image   --       memory
 time ms    avg ms    --   used mb   total mb
  19.815    20.424    --    33.896    479.500
{code}

So the previous version used less memory and took less time to operate.  I 
repeated the test several times.  That being said, the previous version was 
1.0-alpha3 which I downloaded and compiled 24 May 2022. Could the difference be 
somehow related to changes in the pom.xml?

Also, I have not looked at other formats or even other TIFF files at this time. 
 I will do a bit more testing with some of the 300 MB aerial photograph files I 
use and will let you know if I find anything noteworthy.




[jira] [Commented] (IMAGING-356) TIFF reading extremely slow in version 1.0-SNAPSHOT

2023-07-03 Thread Gary D. Gregory (Jira)


[ 
https://issues.apache.org/jira/browse/IMAGING-356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739753#comment-17739753
 ] 

Gary D. Gregory commented on IMAGING-356:
-

Hi [~gwlucas]

I'll take a look with VisualVM again in the AM.
