[GitHub] [commons-io] garydgregory commented on a diff in pull request #358: Add test for AbstractByteArrayOutputStream.resetImpl and .toByteArrayImpl

2022-05-22 Thread GitBox


garydgregory commented on code in PR #358:
URL: https://github.com/apache/commons-io/pull/358#discussion_r878919402


##
src/test/java/org/apache/commons/io/output/ByteArrayOutputStreamTest.java:
##
@@ -350,6 +360,22 @@ public void testWriteZero(final String baosName, final BAOSFactory baosFactor
 }
 }
 
+private static final int FILE_SIZE = (1024 * 4) + 1;
+private final byte[] inData = TestUtils.generateTestData(FILE_SIZE);
+
+@Test
+public void testToByteArrayImplAndResetImpl() throws Exception {
+InputStream in = new ByteArrayInputStream(inData);

Review Comment:
   Use try with resources when possible.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@commons.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [commons-io] garydgregory commented on a diff in pull request #358: Add test for AbstractByteArrayOutputStream.resetImpl and .toByteArrayImpl

2022-05-22 Thread GitBox


garydgregory commented on code in PR #358:
URL: https://github.com/apache/commons-io/pull/358#discussion_r878919402


##
src/test/java/org/apache/commons/io/output/ByteArrayOutputStreamTest.java:
##
@@ -350,6 +360,22 @@ public void testWriteZero(final String baosName, final BAOSFactory baosFactor
 }
 }
 
+private static final int FILE_SIZE = (1024 * 4) + 1;
+private final byte[] inData = TestUtils.generateTestData(FILE_SIZE);
+
+@Test
+public void testToByteArrayImplAndResetImpl() throws Exception {
+InputStream in = new ByteArrayInputStream(inData);

Review Comment:
   Use try with resources
   



##
src/test/java/org/apache/commons/io/output/ByteArrayOutputStreamTest.java:
##
@@ -350,6 +360,22 @@ public void testWriteZero(final String baosName, final BAOSFactory baosFactor
 }
 }
 
+private static final int FILE_SIZE = (1024 * 4) + 1;
+private final byte[] inData = TestUtils.generateTestData(FILE_SIZE);

Review Comment:
   These data do not need to be tracked for all tests. Move them inside the one 
method that uses them.
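
A minimal sketch of what the two review comments ask for, assuming a plain stream copy stands in for the real test body: the streams are opened in a try-with-resources statement, and the test data is local to the one method that uses it. The class name and the `generateTestData` helper here are hypothetical stand-ins, not the Commons IO `TestUtils` method.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class TryWithResourcesSketch {

    /** Hypothetical stand-in for the test-data helper used in the PR. */
    static byte[] generateTestData(final int size) {
        final byte[] data = new byte[size];
        for (int i = 0; i < size; i++) {
            data[i] = (byte) (i % 127 + 1);
        }
        return data;
    }

    /** Copies locally created test data through streams that are closed automatically. */
    static byte[] roundTrip() {
        // The size and data are local variables, not fields shared by every test.
        final int fileSize = 1024 * 4 + 1;
        final byte[] inData = generateTestData(fileSize);
        // Both streams are closed when the block exits, even if an exception is thrown.
        try (InputStream in = new ByteArrayInputStream(inData);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            final byte[] buffer = new byte[512];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return out.toByteArray();
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(final String[] args) {
        System.out.println(roundTrip().length);
    }
}
```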






[GitHub] [commons-lang] kinow commented on pull request #897: Revert #896 spotbugs update

2022-05-22 Thread GitBox


kinow commented on PR #897:
URL: https://github.com/apache/commons-lang/pull/897#issuecomment-1133955319

   Thanks!!!





[GitHub] [commons-lang] garydgregory commented on pull request #897: Revert #896 spotbugs update

2022-05-22 Thread GitBox


garydgregory commented on PR #897:
URL: https://github.com/apache/commons-lang/pull/897#issuecomment-1133935315

   All good now 





[jira] [Commented] (RNG-179) The Fast Loaded Dice Roller: A Near-Optimal Exact Sampler for Discrete Probability Distributions

2022-05-22 Thread Alex Herbert (Jira)


[ 
https://issues.apache.org/jira/browse/RNG-179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17540619#comment-17540619
 ] 

Alex Herbert commented on RNG-179:
--

Thanks for the suggestion. The provided paper shows it may be faster with a 
lower-entropy distribution. It certainly uses fewer bits for each sample. The 
memory usage seems to be higher, although the initialisation cost is lower.

I can have a look at implementing this for the distributions specified using 
integer ratios:
{code:java}
UniformRandomProvider rng = ...;
int[] distribution = {1, 1, 2, 3, 1};
DiscreteSampler sampler = FastLoadedDiceRoller.of(rng, distribution);
{code}
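
For readers unfamiliar with the algorithm, here is a rough, self-contained sketch of the Fast Loaded Dice Roller for integer weights, following the preprocessing and sampling steps described in the linked paper. It is not Commons RNG code: the class name `FldrSketch` is hypothetical, `java.util.Random` stands in for `UniformRandomProvider`, and the sketch assumes strictly positive weights whose sum fits comfortably in an `int`.

```java
import java.util.Random;

/** Illustrative sketch of the Fast Loaded Dice Roller; not the Commons RNG API. */
final class FldrSketch {
    private final int n;      // number of outcomes
    private final int[] h;    // h[j] = number of leaves at tree level j
    private final int[][] H;  // H[d][j] = outcome stored in the d-th leaf at level j
    private final Random rng; // assumption: java.util.Random as the source of random bits

    FldrSketch(final Random rng, final int[] weights) {
        this.rng = rng;
        n = weights.length;
        long sum = 0;
        for (final int w : weights) {
            if (w <= 0) {
                throw new IllegalArgumentException("weights must be positive");
            }
            sum += w;
        }
        if (n == 0 || sum > (1L << 30)) {
            throw new IllegalArgumentException("unsupported weights");
        }
        final int m = (int) sum;
        // k = ceil(log2(m)); a single outcome needs no tree at all.
        final int k = n == 1 ? 0 : 32 - Integer.numberOfLeadingZeros(m - 1);
        // The extra "reject" outcome pads the total weight to exactly 2^k.
        final int r = n == 1 ? 0 : (1 << k) - m;
        h = new int[k];
        H = new int[n + 1][k];
        for (int j = 0; j < k; j++) {
            int d = 0;
            for (int i = 0; i <= n; i++) {
                final int w = i < n ? weights[i] : r;
                // A set bit at position (k - 1 - j) contributes a leaf at level j.
                if (((w >> (k - 1 - j)) & 1) != 0) {
                    H[d][j] = i;
                    d++;
                }
            }
            h[j] = d;
        }
    }

    /** Returns an index in [0, n) with probability weights[i] / sum(weights). */
    int sample() {
        if (n == 1) {
            return 0;
        }
        int c = 0; // current tree level
        int d = 0; // position among the nodes at this level
        while (true) {
            final int b = rng.nextBoolean() ? 1 : 0;
            d = 2 * d + (1 - b);
            if (d < h[c]) {
                final int z = H[d][c];
                if (z < n) {
                    return z;
                }
                // Landed on the reject leaf: restart from the root.
                d = 0;
                c = 0;
            } else {
                d -= h[c];
                c++;
            }
        }
    }

    public static void main(final String[] args) {
        final FldrSketch sampler = new FldrSketch(new Random(), new int[] {1, 1, 2, 3, 1});
        System.out.println(sampler.sample());
    }
}
```

Each loop iteration consumes one random bit, which is where the low bits-per-sample figure comes from; the `H` table is the memory cost mentioned above.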

> The Fast Loaded Dice Roller: A Near-Optimal Exact Sampler for Discrete 
> Probability Distributions
> 
>
> Key: RNG-179
> URL: https://issues.apache.org/jira/browse/RNG-179
> Project: Commons RNG
>  Issue Type: New Feature
>  Components: sampling
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> It might make sense to implement Fast Loaded Dice Roller sampler.
> See https://arxiv.org/pdf/2003.03830v2.pdf, 
> https://github.com/probcomp/fast-loaded-dice-roller
> The authors claim that Fast Loaded Dice Roller is faster than the Alias 
> Method.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (RNG-179) The Fast Loaded Dice Roller: A Near-Optimal Exact Sampler for Discrete Probability Distributions

2022-05-22 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created RNG-179:
-

 Summary: The Fast Loaded Dice Roller: A Near-Optimal Exact Sampler 
for Discrete Probability Distributions
 Key: RNG-179
 URL: https://issues.apache.org/jira/browse/RNG-179
 Project: Commons RNG
  Issue Type: New Feature
  Components: sampling
Reporter: Vladimir Sitnikov


It might make sense to implement Fast Loaded Dice Roller sampler.
See https://arxiv.org/pdf/2003.03830v2.pdf, 
https://github.com/probcomp/fast-loaded-dice-roller

The authors claim that Fast Loaded Dice Roller is faster than the Alias Method.





[jira] [Work logged] (CSV-285) Replace BufferedReader with PushbackReader

2022-05-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CSV-285?focusedWorklogId=773205&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773205
 ]

ASF GitHub Bot logged work on CSV-285:
--

Author: ASF GitHub Bot
Created on: 22/May/22 12:13
Start Date: 22/May/22 12:13
Worklog Time Spent: 10m 
  Work Description: belugabehr opened a new pull request, #169:
URL: https://github.com/apache/commons-csv/pull/169

   https://issues.apache.org/jira/browse/CSV-285




Issue Time Tracking
---

Worklog Id: (was: 773205)
Time Spent: 1h 50m  (was: 1h 40m)

> Replace BufferedReader with PushbackReader
> --
>
> Key: CSV-285
> URL: https://issues.apache.org/jira/browse/CSV-285
> Project: Commons CSV
>  Issue Type: Improvement
>Reporter: David Mollitor
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> {{commons-csv}} uses a {{BufferedReader}} as its base {{Reader}}; however, 
> the more natural choice is {{PushbackReader}}.
> {quote}
> This is useful in situations where it is convenient for a fragment of code to 
> read an indefinite number of data bytes that are delimited by a particular 
> byte value; after reading the terminating byte, the code fragment can 
> "unread" it, so that the next read operation on the input stream will reread 
> the byte that was pushed back. For example, bytes representing the characters 
> constituting an identifier might be terminated by a byte representing an 
> operator character; a method whose job is to read just an identifier can read 
> until it sees the operator and then push the operator back to be re-read.
> {quote}
> https://docs.oracle.com/javase/8/docs/api/java/io/PushbackInputStream.html
> {{commons-csv}} currently implements these "pushback" features on top of a 
> {{BufferedReader}}.
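
The "unread" behaviour quoted above can be sketched with the character-based `java.io.PushbackReader`. This is only an illustration of the JDK class, not the Commons CSV lexer; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class PushbackSketch {

    /** Reads identifier characters and pushes the terminating character back. */
    static String readIdentifier(final PushbackReader reader) throws IOException {
        final StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            if (!Character.isJavaIdentifierPart(c)) {
                // Not part of the identifier: make it available to the next read.
                reader.unread(c);
                break;
            }
            sb.append((char) c);
        }
        return sb.toString();
    }

    /** Returns "identifier|next", where next is the character read after the identifier. */
    static String splitAtOperator(final String input) {
        try (PushbackReader reader = new PushbackReader(new StringReader(input))) {
            final String identifier = readIdentifier(reader);
            final int next = reader.read();
            return identifier + "|" + (next == -1 ? "" : String.valueOf((char) next));
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(final String[] args) {
        System.out.println(splitAtOperator("total+42"));
    }
}
```

The default `PushbackReader` constructor provides a one-character pushback buffer, which is enough for this kind of single-character lookahead.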





[jira] [Work logged] (CSV-285) Replace BufferedReader with PushbackReader

2022-05-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CSV-285?focusedWorklogId=773204&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773204
 ]

ASF GitHub Bot logged work on CSV-285:
--

Author: ASF GitHub Bot
Created on: 22/May/22 12:13
Start Date: 22/May/22 12:13
Worklog Time Spent: 10m 
  Work Description: garydgregory closed pull request #169: CSV-285: Replace 
BufferedReader with PushbackReader
URL: https://github.com/apache/commons-csv/pull/169




Issue Time Tracking
---

Worklog Id: (was: 773204)
Time Spent: 1h 40m  (was: 1.5h)






[jira] [Work logged] (CSV-285) Replace BufferedReader with PushbackReader

2022-05-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CSV-285?focusedWorklogId=773203=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773203
 ]

ASF GitHub Bot logged work on CSV-285:
--

Author: ASF GitHub Bot
Created on: 22/May/22 12:13
Start Date: 22/May/22 12:13
Worklog Time Spent: 10m 
  Work Description: garydgregory commented on PR #169:
URL: https://github.com/apache/commons-csv/pull/169#issuecomment-1133882770

   @belugabehr 




Issue Time Tracking
---

Worklog Id: (was: 773203)
Time Spent: 1.5h  (was: 1h 20m)






[GitHub] [commons-csv] garydgregory commented on pull request #169: CSV-285: Replace BufferedReader with PushbackReader

2022-05-22 Thread GitBox


garydgregory commented on PR #169:
URL: https://github.com/apache/commons-csv/pull/169#issuecomment-1133882770

   @belugabehr 





[GitHub] [commons-csv] belugabehr opened a new pull request, #169: CSV-285: Replace BufferedReader with PushbackReader

2022-05-22 Thread GitBox


belugabehr opened a new pull request, #169:
URL: https://github.com/apache/commons-csv/pull/169

   https://issues.apache.org/jira/browse/CSV-285





[GitHub] [commons-csv] garydgregory closed pull request #169: CSV-285: Replace BufferedReader with PushbackReader

2022-05-22 Thread GitBox


garydgregory closed pull request #169: CSV-285: Replace BufferedReader with 
PushbackReader
URL: https://github.com/apache/commons-csv/pull/169





[jira] [Resolved] (CSV-164) Support duplicate header names

2022-05-22 Thread Gary D. Gregory (Jira)


 [ 
https://issues.apache.org/jira/browse/CSV-164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary D. Gregory resolved CSV-164.
-
Fix Version/s: 1.10.0
   Resolution: Fixed

See also CSV-264

> Support duplicate header names
> --
>
> Key: CSV-164
> URL: https://issues.apache.org/jira/browse/CSV-164
> Project: Commons CSV
>  Issue Type: Bug
>Affects Versions: 1.2
>Reporter: Romain Manni-Bucau
>Priority: Major
> Fix For: 1.10.0
>
>
> Nothing prevents a CSV file from using the same header name more than once, so 
> the validation at the end of org.apache.commons.csv.CSVFormat#validate should 
> likely disappear or should support a flag to disable it.





[jira] [Commented] (CSV-296) Delimiter followed by Whitespace then by Quotes Failing with setTrim(true)

2022-05-22 Thread Gary D. Gregory (Jira)


[ 
https://issues.apache.org/jira/browse/CSV-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17540582#comment-17540582
 ] 

Gary D. Gregory commented on CSV-296:
-

Trimming is only used on _values_ after parsing such that {{" foo "}} becomes 
{{"foo"}}. It seems you are dealing with a parsing issue, not a trimming 
issue.
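
To illustrate the distinction, here is a deliberately simplified toy tokenizer, not the Commons CSV lexer: it treats a value as quoted only when the quote is the very first character after the delimiter, and trims each value after it has been parsed. Under that assumption, leading whitespace before a quote leaves the quotes in the value and lets an embedded comma split it. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class TrimAfterParseSketch {

    /** Toy parser: the quote check happens before any whitespace is consumed. */
    static List<String> parseThenTrim(final String line) {
        final List<String> values = new ArrayList<>();
        final int n = line.length();
        int i = 0;
        while (i <= n) {
            final StringBuilder sb = new StringBuilder();
            if (i < n && line.charAt(i) == '"') {
                // Quoted value: take everything up to the closing quote.
                i++;
                while (i < n && line.charAt(i) != '"') {
                    sb.append(line.charAt(i++));
                }
                i++; // skip the closing quote
                while (i < n && line.charAt(i) != ',') {
                    i++; // ignore anything before the next delimiter
                }
            } else {
                // Unquoted value: every comma is a delimiter, quotes are plain characters.
                while (i < n && line.charAt(i) != ',') {
                    sb.append(line.charAt(i++));
                }
            }
            // Trimming is applied only now, after the value has been parsed.
            values.add(sb.toString().trim());
            i++; // skip the delimiter
        }
        return values;
    }

    public static void main(final String[] args) {
        System.out.println(parseThenTrim("\"Joe\",  \"Schmoe\",\"101 Main Street; Las Vegas, NV 89101\""));
    }
}
```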

> Delimiter followed by Whitespace then by Quotes Failing with setTrim(true)
> --
>
> Key: CSV-296
> URL: https://issues.apache.org/jira/browse/CSV-296
> Project: Commons CSV
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.8, 1.9.0
> Environment: +{*}macOS{*}:+
> {code:java}
> > uname -a
> Darwin Senzing-MacBook-Pro.local 21.4.0 Darwin Kernel Version 21.4.0: Fri Mar 
> 18 00:45:05 PDT 2022; root:xnu-8020.101.4~15/RELEASE_X86_64 x86_64 {code}
> {code:java}
> > java -version
> openjdk version "11.0.14" 2022-01-18
> OpenJDK Runtime Environment Temurin-11.0.14+9 (build 11.0.14+9)
> OpenJDK 64-Bit Server VM Temurin-11.0.14+9 (build 11.0.14+9, mixed mode) 
> {code}
> {+}*Linux*{+}:
> {code:java}
> > uname -a
> Linux lnxdev 5.4.0-109-generic #123-Ubuntu SMP Fri Apr 8 09:10:54 UTC 2022 
> x86_64 x86_64 x86_64 GNU/Linux {code}
> {code:java}
> > java -version
> openjdk version "11.0.11" 2021-04-20
> OpenJDK Runtime Environment AdoptOpenJDK-11.0.11+9 (build 11.0.11+9)
> OpenJDK 64-Bit Server VM AdoptOpenJDK-11.0.11+9 (build 11.0.11+9, mixed 
> mode){code}
>Reporter: Barry M. Caceres
>Priority: Major
> Attachments: csvfail.zip
>
>
> I have my CSVFormat initialized such that *{{withTrim(true)}}* has been set 
> {_}(see attached ZIP file){_}:
> {code:java}
> CSVFormat csvFormat = CSVFormat.DEFAULT.withFirstRecordAsHeader()
>         .withIgnoreEmptyLines(true).withTrim(true);{code}
>  
> However, a quoted string that is separated from the preceding delimiter by 
> whitespace is not properly parsed. For example:
> {code:java}
> GIVEN_NAME,SURNAME,ADDRESS,PHONE_NUMBER
> "Joe",  "Schmoe","101 Main Street; Las Vegas, NV 89101","702-555-1212"
> "John","Doe",  "201 First Street; Las Vegas, NV 89102", "702-555-1313"
> "Jane","Doe","301 Second Street; Las Vegas, NV 89103","702-555-1414"
> {code}
>  
>  * Notice the whitespace preceding {color:#0747a6}*{{"Schmoe"}}*{color} on 
> the first record?  This leads to the actual value containing the quotation 
> marks instead of them being stripped off.
>  * The whitespace preceding {color:#0747a6}*{{"201 First Street; Las Vegas, 
> NV 89102"}}*{color} on the second record leads to it being parsed as two 
> values: {color:#0747a6}*{{"201 First Street; Las Vegas}}*{color} and {*}{{NV 
> 89102"}}{*}.
>  * The third record is the only one that parses as expected.
> I believe that this is because the trimming is done *after* the value is 
> parsed rather than consuming the whitespace following the delimiter 
> during parsing. Either that, or the check for a quoted string is occurring 
> *before* the whitespace is consumed.
>  
> *NOTE:* I have attached a ZIP file that easily reproduces the problem with 
> the CSV file given above.
> To build the attached project, use Apache Maven and then execute using 
> Java 11:
> {code:java}
> > unzip csvfail.zip
> > cd csvfail
> > mvn package
> > java -jar target/csv-fail-1.0-SNAPSHOT.jar{code}





[GitHub] [commons-lang] garydgregory merged pull request #897: Revert #896 spotbugs update

2022-05-22 Thread GitBox


garydgregory merged PR #897:
URL: https://github.com/apache/commons-lang/pull/897





[GitHub] [commons-text] garydgregory commented on a diff in pull request #328: Revert #327 change (checkstyle 9.3 to 10.2) due to JVM incompatibility. Fix GH Actions to run checkstyle.

2022-05-22 Thread GitBox


garydgregory commented on code in PR #328:
URL: https://github.com/apache/commons-text/pull/328#discussion_r878841984


##
.github/workflows/maven.yml:
##
@@ -42,4 +42,4 @@ jobs:
 java-version: ${{ matrix.java }}
 cache: 'maven'
 - name: Build with Maven
-  run: mvn -V apache-rat:check spotbugs:check javadoc:javadoc -Ddoclint=all package --file pom.xml --no-transfer-progress
+  run: mvn

Review Comment:
   Yes, we want to run the default goal :-)






[GitHub] [commons-text] garydgregory merged pull request #328: Revert #327 change (checkstyle 9.3 to 10.2) due to JVM incompatibility. Fix GH Actions to run checkstyle.

2022-05-22 Thread GitBox


garydgregory merged PR #328:
URL: https://github.com/apache/commons-text/pull/328

