[GitHub] [commons-compress] PeterAlfreadLee commented on issue #90: Compress-477 : add zip64 support for split zip
PeterAlfreadLee commented on issue #90: Compress-477 : add zip64 support for split zip URL: https://github.com/apache/commons-compress/pull/90#issuecomment-581281141 > This is not really a new issue introduced with the split zips feature, MultiReadOnlySeekableByteChannel has been keeping channels open before as well. Yes, you're right. This is not a new issue with split zips. Since `MultiReadOnlySeekableByteChannel` is also used for 7zip split segments, I believe that code path is affected by this as well. > OTOH I expect most split archives to only have a pretty small number of parts, an archive with more than 100 parts is something that I don't expect to be common. Therefore making sure we don't open too many streams is probably not that important at all. I agree. I have actually never come across a split zip with more than 20 segments. This is not common, just a special use case introduced by the zip specification, and it may not be exercised in most cases. I have already finished the work, so I will push the PR and leave it up to you to decide whether to merge it or not. @bodewig This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (BEANUTILS-532) Require commons-beanutils library which supports commons-collections-4.1 version
[ https://issues.apache.org/jira/browse/BEANUTILS-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028703#comment-17028703 ] AvanthikaNC commented on BEANUTILS-532: --- {color:#172b4d}Hi Gary D. Gregory,{color} {color:#172b4d}We cannot find the library in the snapshot repository that you provided, and in any case we cannot release it to production since, as you mentioned, it needs a bit more work to complete.{color} {color:#172b4d}Could you please provide a tentative date for the Commons BeanUtils version supporting the Commons Collections 4.x library.{color} {color:#172b4d}Thanks{color} > Require commons-beanutils library which supports commons-collections-4.1 > version > - > > Key: BEANUTILS-532 > URL: https://issues.apache.org/jira/browse/BEANUTILS-532 > Project: Commons BeanUtils > Issue Type: Bug > Components: Bean-Collections >Reporter: AvanthikaNC >Priority: Blocker > Attachments: image-2020-01-31-14-52-43-114.png > > > Hi Team, > > We are working on the ATM SWITCH project, which currently uses > commons-beanutils 1.9.4; we have upgraded to > commons-collections-4.1 as part of our project requirements because the older > version contained vulnerabilities. > We are facing errors due to the above-mentioned upgrade, as > commons-beanutils 1.9.4 only supports commons-collections 3.2.2. > As per our requirements we cannot downgrade the commons-collections library, so > we need a commons-beanutils release which supports commons-collections4-4.1. > Please provide your response asap. > Thanks -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (COLLECTIONS-737) The test FluentIterableTest.size should be split
[ https://issues.apache.org/jira/browse/COLLECTIONS-737?focusedWorklogId=380558=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380558 ] ASF GitHub Bot logged work on COLLECTIONS-737: -- Author: ASF GitHub Bot Created on: 02/Feb/20 16:56 Start Date: 02/Feb/20 16:56 Worklog Time Spent: 10m Work Description: Prodigysov commented on issue #120: [COLLECTIONS-737] The test FluentIterableTest.size should be splitted URL: https://github.com/apache/commons-collections/pull/120#issuecomment-581154786 Sounds good, thanks for the review! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 380558) Time Spent: 1h (was: 50m) > The test FluentIterableTest.size should be split > > > Key: COLLECTIONS-737 > URL: https://issues.apache.org/jira/browse/COLLECTIONS-737 > Project: Commons Collections > Issue Type: Test > Components: Collection >Reporter: Pengyu Nie >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > The first part of FluentIterableTest.size is not testing function > FluentIterable.size (see code copied below). Actually, > FluentIterable.size will not be invoked at all, because > FluentIterable.of(null) will throw an NPE before that. This part > should be extracted as a separate unit test like > FluentIterableTest.ofNull. > {code:java} > try { > FluentIterable.of((Iterable) null).size(); > fail("expecting NullPointerException"); > } catch (final NullPointerException npe) { > // expected > }{code} > I'll create a pull request for this issue. -- This message was sent by Atlassian Jira (v8.3.4#803005)
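The suggested split can be illustrated with a plain-Java stand-in (the `of`/`size` helpers below are hypothetical stand-ins, not the Commons Collections API): the null-argument check becomes its own test, separate from the test that actually exercises size.

```java
import java.util.Arrays;
import java.util.Objects;

// Plain-Java sketch of the suggested test split (stand-in code, not the
// Commons Collections API): the null-input check and the size check become
// two independent tests instead of one combined test.
public class SplitTestSketch {

    // Stand-in for FluentIterable.of(...): rejects null up front, so the
    // NPE is thrown before size() is ever invoked.
    static <T> Iterable<T> of(Iterable<T> iterable) {
        return Objects.requireNonNull(iterable, "iterable");
    }

    static <T> int size(Iterable<T> iterable) {
        int n = 0;
        for (T ignored : iterable) {
            n++;
        }
        return n;
    }

    // Test 1 (like the proposed FluentIterableTest.ofNull): of(null) throws.
    static boolean ofNullThrows() {
        try {
            size(of((Iterable<Object>) null));
            return false; // expecting NullPointerException
        } catch (NullPointerException expected) {
            return true;
        }
    }

    // Test 2: size() on a real iterable, tested on its own.
    static boolean sizeWorks() {
        return size(of(Arrays.asList(1, 2, 3))) == 3;
    }

    public static void main(String[] args) {
        System.out.println(ofNullThrows() && sizeWorks()); // prints "true"
    }
}
```

The point mirrors the report: because the factory rejects null before `size()` runs, the try/catch belongs in its own test rather than inside the size test.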
[GitHub] [commons-collections] Prodigysov commented on issue #120: [COLLECTIONS-737] The test FluentIterableTest.size should be splitted
Prodigysov commented on issue #120: [COLLECTIONS-737] The test FluentIterableTest.size should be splitted URL: https://github.com/apache/commons-collections/pull/120#issuecomment-581154786 Sounds good, thanks for the review! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (NUMBERS-70) Userguide and reports
[ https://issues.apache.org/jira/browse/NUMBERS-70?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028479#comment-17028479 ] Kaaira Gupta commented on NUMBERS-70: - Hey [~erans] !! Is this up for Outreachy summers 2020 as well? > Userguide and reports > - > > Key: NUMBERS-70 > URL: https://issues.apache.org/jira/browse/NUMBERS-70 > Project: Commons Numbers > Issue Type: Wish >Reporter: Gilles Sadowski >Priority: Minor > Labels: benchmark, documentation, gsoc2020 > Attachments: 0001-Angles-xdoc-is-added.patch, > 0001-Prime-xdoc-file-is-added.patch, 0001-Primes-xdoc-is-added.patch > > > Review contents of the component towards providing an up-to-date userguide > and write benchmarking code for generating performance reports > ([JMH|http://openjdk.java.net/projects/code-tools/jmh/]). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (COMPRESS-491) Deflate64CompressorInputStream.read(byte[]) works incorrectly
[ https://issues.apache.org/jira/browse/COMPRESS-491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Bodewig updated COMPRESS-491: Component/s: Archivers > Deflate64CompressorInputStream.read(byte[]) works incorrectly > - > > Key: COMPRESS-491 > URL: https://issues.apache.org/jira/browse/COMPRESS-491 > Project: Commons Compress > Issue Type: Bug > Components: Archivers >Affects Versions: 1.18 >Reporter: Juha Syrjälä >Priority: Major > Fix For: 1.20 > > > The read(byte[]) method in > org.apache.commons.compress.compressors.deflate64.Deflate64CompressorInputStream > sometimes incorrectly returns the value 0: > https://docs.oracle.com/javase/7/docs/api/java/io/InputStream.html#read(byte[]) > > Reads some number of bytes from the input stream and stores them into the > > buffer array b. The number of bytes actually read is returned as an > > integer. This method blocks until input data is available, end of file is > > detected, or an exception is thrown. > If the length of b is zero, then no bytes are read and 0 is returned; > otherwise, there is an attempt to read at least one byte. If no byte is > available because the stream is at the end of the file, the value -1 is > returned; otherwise, at least one byte is read and stored into b. > The first byte read is stored into element b[0], the next one into b[1], and > so on. The number of bytes read is, at most, equal to the length of b. Let k > be the number of bytes actually read; these bytes will be stored in elements > b[0] through b[k-1], leaving elements b[k] through b[b.length-1] unaffected. > This means that the `read` method can return `0` only when a zero-length byte array > is passed in. > Otherwise read must block until there is at least 1 byte of data available, > or return -1 for end of stream. > Currently in commons-compress 1.18, the class > org.apache.commons.compress.compressors.deflate64.Deflate64CompressorInputStream > returns `0` for some buffer sizes and some Zip files. 
> See [https://github.com/jsyrjala/apache-commons-compress-bug] for test case -- This message was sent by Atlassian Jira (v8.3.4#803005)
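As a hedged illustration of the contract quoted above (this wrapper is not part of Commons Compress; the class name is invented), a caller-side workaround could normalise spurious 0 returns by retrying the read:

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Workaround sketch only -- NOT part of Commons Compress. It hides spurious
// 0 returns from read(byte[], int, int) so that callers see the standard
// InputStream contract: 0 is returned only for a zero-length buffer.
class ZeroReadRetryInputStream extends FilterInputStream {

    ZeroReadRetryInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (len == 0) {
            return 0; // the only case where 0 is a legal return value
        }
        int n;
        do {
            n = in.read(b, off, len); // retry until data arrives or EOF
        } while (n == 0);
        return n;
    }
}
```

Note the trade-off: if the underlying stream returned 0 forever this loop would spin, so it only masks the symptom; the actual fix belongs inside Deflate64CompressorInputStream.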
[jira] [Commented] (NUMBERS-143) Investigate Math.hypot for computing the absolute of a complex number
[ https://issues.apache.org/jira/browse/NUMBERS-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028450#comment-17028450 ] Alex Herbert commented on NUMBERS-143: -- Here are the definitions of what hypot is required to do from ISO C99: 7.12.7.3 The hypot functions compute the square root of the sum of the squares of x and y, without undue overflow or underflow. A range error may occur. F.9.4.3 Special cases: * hypot(x, y), hypot(y, x), and hypot(x, −y) are equivalent. * hypot(x, ±0) is equivalent to |x|. * hypot(±∞, y) returns +∞, even if y is a NaN. There is no requirement for a set level of accuracy. The fdlibm implementation of hypot specifies it achieves <1 ULP from the exact result. Thus an implementation should handle scaling of the intermediate sum {{x^2+y^2}} to avoid over/underflow, be commutative and handle a special case of zero or infinite. Here is the pseudocode for what fdlibm does: {noformat} // Get exponents of input x and y // Get the absolutes: |x| and |y| // Sort |x| and |y| by magnitude // If the exponent difference is large: return |x| (this may be inf/nan) // If the exponent is large then: // If |x| is inf/nan then: return inf/nan (depends on |y|) // else: scale down // If the exponent is small then: // If |y| is zero then: return |x| // else: scale up // Compute sqrt(x*x + y*y) // Scale back the result // return result {noformat} The code uses the exponent to perform magnitude checks and handle the special cases of inf/nan without ever requiring explicit identification of inf/nan. There are variations on this to do the scaling and inf/nan checks. A key point is that the computation of {{x^2+y^2}} could be done using any method. 
The hypot function aims for high precision using a condition based on the relative magnitude of |x| and |y|: {code:java} final double w = x - y; if (w > y) { // x > 2y // Extended precision computation of x^2, standard y^2 } else { // 2y > x > y // Extended precision computation of x^2 + y^2 } {code} The point here is that when y is of a similar scale to x, any round-off of y^2 is as important as round-off of x^2. This branch is a key part of the performance of the method. For uniformly distributed polar coordinate (magnitude, angle) input data, x and y are the edges of a right-angle triangle with the magnitude as the hypotenuse. If |x| and |y| can take any value then they are within a 2-fold factor of each other when the ratio of the edges of the triangle is between 2/1 and 1/2. The angle between these ratios is arctan(2/1) - arctan(1/2) = arctan(3/4) (by trigonometric identity). This forms a segment of the quarter circle with area fraction arctan(3/4) / (pi/2) = 0.41. Thus the branch for (2y > x > y) will be taken 41% of the time. If data is generated using a uniform distribution of x and y then the input data lies in a square (with one vertex at the origin (0,0)) containing the point (x,y). The branch (2y > x > y) will be taken 50% of the time. This follows from the lines through (2,1) and (1,2) creating segments in a 2x2 square: the area between them is (2*2 - 2*1/2 - 2*1/2) / 4 = 2 / 4. (A diagram would have helped here.) The result is that for randomly simulated data that is uniform in polar or Cartesian coordinates this branch is taken around half the time. This is hard to predict for any input data and the processor cannot efficiently pipeline this computation. This will be demonstrated in results from JMH performance benchmarks. 
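The two fractions derived above can be checked numerically with a couple of lines (a throwaway sketch, not part of the benchmark code):

```java
// Numerical check of the branch-probability claims above.
public class BranchFraction {
    public static void main(String[] args) {
        // Polar-uniform data: fraction of the quarter circle where 2y > x > y,
        // i.e. an angular segment of arctan(3/4) out of pi/2.
        double polar = Math.atan(3.0 / 4.0) / (Math.PI / 2);
        System.out.println(polar); // ~0.41

        // Cartesian-uniform data: area between the lines y = x/2 and y = 2x
        // inside a 2x2 square is (2*2 - 2*1/2 - 2*1/2) / 4 = 0.5.
        double cartesian = (2 * 2 - 2 * 1 / 2.0 - 2 * 1 / 2.0) / 4.0;
        System.out.println(cartesian); // 0.5
    }
}
```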
The following candidate methods will be tested to compute {{x^2+y^2}}: * fdlibm computation (involves a branch) * simple x*x + y*y * fused multiply add: Math.fma(x, x, y*y) * extended precision x*x + y*y summation using Dekker's method then standard sqrt Ideally the reference would be computed using 128-bit precision. This can be done using Java 9 BigDecimal which has a sqrt() function. An alternative is extended precision x*x + y*y summation and extended precision sqrt using Dekker's method [1] to set a reference value. Results from an accuracy test and performance test will be added to this ticket. 1. [Dekker (1971) A floating-point technique for extending the available precision|https://doi.org/10.1007/BF01397083] > Investigate Math.hypot for computing the absolute of a complex number > - > > Key: NUMBERS-143 > URL: https://issues.apache.org/jira/browse/NUMBERS-143 > Project: Commons Numbers > Issue Type: Task > Components: complex >Reporter: Alex Herbert >Priority: Minor > > {{Math.hypot}} computes the value {{sqrt(x^2+y^2)}} to within 1 ULP. The > function uses the [e_hypot.c|https://www.netlib.org/fdlibm/e_hypot.c] > implementation from the Freely Distributable Math Library (fdlibm). > Pre-java 9 this function used JNI to call an external implementation. The
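The "extended precision x*x + y*y summation using Dekker's method then standard sqrt" candidate can be sketched as follows. This is a minimal illustration only: it omits the scaling fdlibm performs to avoid over/underflow and all inf/nan special cases, and the class and method names are mine, not the benchmark code.

```java
// Sketch of one candidate: Dekker's two-product for x*x and y*y, an error-free
// two-sum of the high parts, then a standard sqrt. No over/underflow scaling.
public class DekkerHypot {

    // Dekker's splitting constant for doubles: 2^27 + 1 (Dekker 1971)
    private static final double SPLIT = 134217729.0;

    // Split a into hi + lo where hi has at most 26 significant bits.
    static double[] split(double a) {
        double c = SPLIT * a;
        double hi = c - (c - a);
        return new double[] {hi, a - hi};
    }

    // Exact product: result[0] + result[1] == a * b (two-product without FMA)
    static double[] twoProduct(double a, double b) {
        double p = a * b;
        double[] as = split(a);
        double[] bs = split(b);
        double e = ((as[0] * bs[0] - p) + as[0] * bs[1] + as[1] * bs[0]) + as[1] * bs[1];
        return new double[] {p, e};
    }

    // sqrt(x^2 + y^2) with the squares accumulated in extended precision.
    static double hypot(double x, double y) {
        double[] xx = twoProduct(x, x);
        double[] yy = twoProduct(y, y);
        double s = xx[0] + yy[0];
        // Round-off of the main sum via a branch-free two-sum (Knuth),
        // plus the low parts of the two products.
        double t = s - xx[0];
        double err = (xx[0] - (s - t)) + (yy[0] - t) + xx[1] + yy[1];
        return Math.sqrt(s + err);
    }

    public static void main(String[] args) {
        System.out.println(hypot(3.0, 4.0)); // 5.0
        System.out.println(hypot(1.5, 2.5));
    }
}
```

The sqrt itself is still a standard rounded operation here; the extended-precision sqrt via Dekker mentioned above would be needed to push the reference value beyond double precision.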
[jira] [Commented] (COMPRESS-502) Allow to disable closing files in the finalizer of ZipFile
[ https://issues.apache.org/jira/browse/COMPRESS-502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028444#comment-17028444 ] Stefan Bodewig commented on COMPRESS-502: - Sounds as if both [~ggregory] and I could live with either a constructor-arg or an instance setter. > Allow to disable closing files in the finalizer of ZipFile > -- > > Key: COMPRESS-502 > URL: https://issues.apache.org/jira/browse/COMPRESS-502 > Project: Commons Compress > Issue Type: Improvement > Components: Compressors >Affects Versions: 1.19 >Reporter: Dominik Stadler >Priority: Major > > Apache POI uses commons-compress for handling ZipFiles. We found that it > sometimes does some auto-close magic in the finalizer of the ZipFile class > with printing out to stderr, see > https://gitbox.apache.org/repos/asf?p=commons-compress.git;a=blob;f=src/main/java/org/apache/commons/compress/archivers/zip/ZipFile.java;h=23194560dace91d8052626f3bdc8f765f9c46d7e;hb=HEAD#l652. > > This has some shortcomings: > * It prints to stderr which many large-scale applications try to avoid by > using some logging framework, thus this output might "vanish" unseen in some > installations or might cause unexpected side-effects > * It prevents us from using tools for checking file leaks, e.g. we use > [https://github.com/kohsuke/file-leak-detector/] heavily for analyzing > test-runs for leaked file-handles, but commons-compress prevents this because > it "hides" the unclosed file from this functionality > * The behavior of automatic closing and reporting the problem is > non-reproducible because it depends on finalizer/garbage-collection and thus > any re-runs or unit-tests usually do not show the same behavior > > There are some fairly simple options to improve this: > * Allow to disable this functionality via configuration/system-property/... 
> * Make this "pluggable" so a logging-framework can be plugged-in or closing > can be prevented for certain runs > > I can provide a simple patch if you state which approach you think would make > most sense here. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [commons-compress] bodewig commented on issue #90: Compress-477 : add zip64 support for split zip
bodewig commented on issue #90: Compress-477 : add zip64 support for split zip URL: https://github.com/apache/commons-compress/pull/90#issuecomment-581135612 This is not really a new issue introduced with the split zips feature, `MultiReadOnlySeekableByteChannel` has been keeping channels open before as well. Honestly I haven't got any idea of how much "jumping around" reading split archives (zip or 7z) actually involves. In both cases we read the channels containing file meta data once and will likely never go back to them. So a small number of open channels may be sufficient. OTOH I expect most split archives to only have a pretty small number of parts, an archive with more than 100 parts is something that I don't expect to be common. Therefore making sure we don't open too many streams is probably not that important at all. I realize I'm not answering your question :-) I'd make the number of simultaneously opened streams configurable. "infinite" might even be a good default so that people only need to deal with it explicitly if they run into trouble. But in real life situations I'd expect any number bigger than say 20 to have the same effect as infinity (i.e. all channels are open as there are no more than 20 channels anyway). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
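A sketch of what a configurable limit might look like (all names here are invented, this is not a proposed API): an access-ordered LinkedHashMap acting as an LRU cache that closes the least recently used channel once the limit is exceeded, reopening a segment on demand if it was evicted.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntFunction;

// Sketch only (invented names): keep at most maxOpen segment channels open,
// closing the least recently used one when the limit is exceeded.
class OpenChannelCache<C extends Closeable> {

    private final Map<Integer, C> open;

    OpenChannelCache(final int maxOpen) {
        // accessOrder = true turns this LinkedHashMap into an LRU map
        this.open = new LinkedHashMap<Integer, C>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(final Map.Entry<Integer, C> eldest) {
                if (size() > maxOpen) {
                    try {
                        eldest.getValue().close();
                    } catch (final IOException ex) {
                        throw new UncheckedIOException(ex);
                    }
                    return true;
                }
                return false;
            }
        };
    }

    /** Returns the channel for a segment, reopening it if it was evicted. */
    C channel(final int segment, final IntFunction<C> opener) {
        return open.computeIfAbsent(segment, opener::apply);
    }

    public static void main(String[] args) {
        final java.util.List<Integer> closed = new java.util.ArrayList<>();
        final OpenChannelCache<Closeable> cache = new OpenChannelCache<>(2);
        for (int i = 0; i < 3; i++) {
            final int id = i;
            cache.channel(id, s -> () -> closed.add(id));
        }
        System.out.println(closed); // segment 0 was evicted and closed first
    }
}
```

With an "infinite" (or merely large, e.g. 20) default as suggested above, eviction would simply never trigger for typical split archives.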
[jira] [Resolved] (COMPRESS-494) ZipArchieveInputStream component is throwing "Invalid Entry Size"
[ https://issues.apache.org/jira/browse/COMPRESS-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Bodewig resolved COMPRESS-494. - Resolution: Not A Bug As I explained in my last comment here and later in COMPRESS-500, there are certain archives that are impossible to read with {{ZipArchiveInputStream}}. We have documented this and by now throw a helpful exception (thanks to COMPRESS-483) if such a file is encountered. Unfortunately you are facing such an archive and there is no workaround. > ZipArchieveInputStream component is throwing "Invalid Entry Size" > - > > Key: COMPRESS-494 > URL: https://issues.apache.org/jira/browse/COMPRESS-494 > Project: Commons Compress > Issue Type: Bug >Affects Versions: 1.8, 1.18 >Reporter: Anvesh Mora >Priority: Critical > Attachments: commons-compress-1.20-SNAPSHOT.jar > > > I have observed in my development that certain zip files which we are able to > extract with the unzip utility on Linux fail with the Compress library. 
> > As of now I have a stack trace to share; I will add more here once > discussion begins on this: > > {code:java} > Caused by: java.lang.IllegalArgumentException: invalid entry size > at > org.apache.commons.compress.archivers.zip.ZipArchiveEntry.setSize(ZipArchiveEntry.java:550) > at > org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.readDataDescriptor(ZipArchiveInputStream.java:702) > at > org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.bufferContainsSignature(ZipArchiveInputStream.java:805) > at > org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.readStoredEntry(ZipArchiveInputStream.java:758) > at > org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.readStored(ZipArchiveInputStream.java:407) > at > org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:382) > {code} > I missed adding version info earlier: > the version of the lib I am using is 1.9. > I also tried version 1.18; the issue is observed in that version too. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (COMPRESS-500) Discrepancy in file size extracted using ZipArchieveInputStream and Gzip decompress component
[ https://issues.apache.org/jira/browse/COMPRESS-500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Bodewig resolved COMPRESS-500. - Resolution: Not A Bug > Discrepancy in file size extracted using ZipArchieveInputStream and Gzip > decompress component > -- > > Key: COMPRESS-500 > URL: https://issues.apache.org/jira/browse/COMPRESS-500 > Project: Commons Compress > Issue Type: Bug > Components: Compressors >Affects Versions: 1.8, 1.18 >Reporter: Anvesh Mora >Priority: Major > Attachments: Compress500.java, invalidzip.zip.partaa, > invalidzip.zip.partab, invalidzip.zip.partac, invalidzip.zip.partad, > invalidzip.zip.partae, invalidzip.zip.partaf, invalidzip.zip.partag, > invalidzip.zip.partah, invalidzip.zip.partai > > > Recent time I raised a bug facing a issue of "invalid Entry Size" > COMPRESS-494 ( Not resolved yet). > > And we are seeing a new issue, before explaining we have a file structure as > below and it is received as a stream of data over HTTPS. > > *File Structure*: > In Zip file > We have zero or more gz files which need to decompressed > And meta data at the end of the zip entries (end of stream), used for > downloading next file zip file. As plain text. > > And Now in production we are seeing new issue where we the entire gz file is > not decompressing. We found out that the utility on Cent OS7 is able to > extract and decompress the entire where as our library is failing. Below are > the differences in Sizes: > Using API: *765460480* bytes > And using Cent OS7 Linux utilities: *2032925215* bytes. > > We are getting EOF File exception at GzipCompressorInputStream.java:278, I'm > not sure of why. > > Need you help on this as we are blocked in the production. This could be a > potential fix for our library to make it more robust. > > Let me know HOW CAN WE INCREASE THE PRIORITY IF NEEDED! > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (COMPRESS-500) Discrepancy in file size extracted using ZipArchieveInputStream and Gzip decompress component
[ https://issues.apache.org/jira/browse/COMPRESS-500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Bodewig updated COMPRESS-500: Attachment: Compress500.java > Discrepancy in file size extracted using ZipArchieveInputStream and Gzip > decompress component > -- > > Key: COMPRESS-500 > URL: https://issues.apache.org/jira/browse/COMPRESS-500 > Project: Commons Compress > Issue Type: Bug > Components: Compressors >Affects Versions: 1.8, 1.18 >Reporter: Anvesh Mora >Priority: Major > Attachments: Compress500.java, invalidzip.zip.partaa, > invalidzip.zip.partab, invalidzip.zip.partac, invalidzip.zip.partad, > invalidzip.zip.partae, invalidzip.zip.partaf, invalidzip.zip.partag, > invalidzip.zip.partah, invalidzip.zip.partai > > > Recent time I raised a bug facing a issue of "invalid Entry Size" > COMPRESS-494 ( Not resolved yet). > > And we are seeing a new issue, before explaining we have a file structure as > below and it is received as a stream of data over HTTPS. > > *File Structure*: > In Zip file > We have zero or more gz files which need to decompressed > And meta data at the end of the zip entries (end of stream), used for > downloading next file zip file. As plain text. > > And Now in production we are seeing new issue where we the entire gz file is > not decompressing. We found out that the utility on Cent OS7 is able to > extract and decompress the entire where as our library is failing. Below are > the differences in Sizes: > Using API: *765460480* bytes > And using Cent OS7 Linux utilities: *2032925215* bytes. > > We are getting EOF File exception at GzipCompressorInputStream.java:278, I'm > not sure of why. > > Need you help on this as we are blocked in the production. This could be a > potential fix for our library to make it more robust. > > Let me know HOW CAN WE INCREASE THE PRIORITY IF NEEDED! > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (COMPRESS-500) Discrepancy in file size extracted using ZipArchieveInputStream and Gzip decompress component
[ https://issues.apache.org/jira/browse/COMPRESS-500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028427#comment-17028427 ] Stefan Bodewig commented on COMPRESS-500: - I've written a small test program and will attach it. Running it you see: * things work as expected when using ZipFile * things don't work at all and you get an exception when using ZipArchiveInputStream without explicitly allowing the combination of data descriptors and stored entries. This has been true since 1.8 and remains true * when allowing the combination, Compress 1.19 and later throws an exception {code:java} java.util.zip.ZipException: compressed and uncompressed size don't match while reading a stored entry using data descriptor. Either the archive is broken or it can not be read using ZipArchiveInputStream and you must use ZipFile. A common cause for this is a ZIP archive containing a ZIP archive. See http://commons.apache.org/proper/commons-compress/zip.html#ZipArchiveInputStream_vs_ZipFile {code} So unfortunately what I suspected is true. You are looking at the kind of archive that is not possible to extract using {{ZipArchiveInputStream}} and there is no workaround for it. If you create the archive yourself, either ensure you don't use a data descriptor and store the size information inside of the local file header, or use the DEFLATED method, as wasteful as it may seem. If you do not control the original archive, then you must store it to disk or keep it in memory (see {{SeekableInMemoryByteChannel}}) and use {{ZipFile}}. 
> Discrepancy in file size extracted using ZipArchieveInputStream and Gzip > decompress component > -- > > Key: COMPRESS-500 > URL: https://issues.apache.org/jira/browse/COMPRESS-500 > Project: Commons Compress > Issue Type: Bug > Components: Compressors >Affects Versions: 1.8, 1.18 >Reporter: Anvesh Mora >Priority: Major > Attachments: invalidzip.zip.partaa, invalidzip.zip.partab, > invalidzip.zip.partac, invalidzip.zip.partad, invalidzip.zip.partae, > invalidzip.zip.partaf, invalidzip.zip.partag, invalidzip.zip.partah, > invalidzip.zip.partai > > > Recent time I raised a bug facing a issue of "invalid Entry Size" > COMPRESS-494 ( Not resolved yet). > > And we are seeing a new issue, before explaining we have a file structure as > below and it is received as a stream of data over HTTPS. > > *File Structure*: > In Zip file > We have zero or more gz files which need to decompressed > And meta data at the end of the zip entries (end of stream), used for > downloading next file zip file. As plain text. > > And Now in production we are seeing new issue where we the entire gz file is > not decompressing. We found out that the utility on Cent OS7 is able to > extract and decompress the entire where as our library is failing. Below are > the differences in Sizes: > Using API: *765460480* bytes > And using Cent OS7 Linux utilities: *2032925215* bytes. > > We are getting EOF File exception at GzipCompressorInputStream.java:278, I'm > not sure of why. > > Need you help on this as we are blocked in the production. This could be a > potential fix for our library to make it more robust. > > Let me know HOW CAN WE INCREASE THE PRIORITY IF NEEDED! > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (JEXL-323) Ant-style variables can throw exception when evaluated for their value
[ https://issues.apache.org/jira/browse/JEXL-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028425#comment-17028425 ] Dmitri Blinov commented on JEXL-323: Not sure whether its worth creating a separate issue for this, but the following test case fails, because we are mixing null values with undefined variables when resolving antish variables. {code:java} @Test public void testBadAnt() throws Exception { JexlEvalContext ctxt = new JexlEvalContext(); JexlOptions options = ctxt.getEngineOptions(); ctxt.set("x.y", 42); JexlScript script = JEXL.createScript("var x = null; x.y"); try { Object result = script.execute(ctxt); Assert.fail("antish var shall not be resolved"); } catch(JexlException xother) { Assert.assertTrue(xother != null); } } {code} > Ant-style variables can throw exception when evaluated for their value > -- > > Key: JEXL-323 > URL: https://issues.apache.org/jira/browse/JEXL-323 > Project: Commons JEXL > Issue Type: Bug >Affects Versions: 3.1 >Reporter: David Costanzo >Assignee: Henri Biestro >Priority: Minor > Fix For: 3.2 > > > When try to evaluate an expression that is the name of a variable and the > value is null, I get the value null. This is good. However, when I do the > same thing with an ant-style variable name, a JexlException$Variable is > thrown claiming that the variable is null. I think this is a bug because I > would expect all variables to behave the same, regardless of their name. > The reason for this behavior is evident in Interpreter.visit() and > InterpreterBase.unsolvableVariable(). There is already special-case logic to > detect when an ant variable is null versus when it's undefined, and this > information is given to unsolvableVariable(), but it still throws an > exception. 
> > {code:java} > if (object == null && !isTernaryProtected(node)) { > if (antish && ant != null) { > // V--- NOTE: context.has() returns true, so undefined is false > boolean undefined = !(context.has(ant.toString()) || > isLocalVariable(node, 0)); > // variable unknown in context and not a local > return unsolvableVariable(node, ant.toString(), undefined); // > <-- still throws exception > } else if (!pty) { > return unsolvableProperty(node, ".", null); > } > } > {code} > In in unsolvableVariable(): > > {code:java} > protected Object unsolvableVariable(JexlNode node, String var, boolean undef) > { > // V-- NOTE: both my engine and arithmetic are strict, so this evaluates > to true > if (isStrictEngine() && (undef || arithmetic.isStrict())) { > throw new JexlException.Variable(node, var, undef); > } else if (logger.isDebugEnabled()) { > logger.debug(JexlException.variableError(node, var, undef)); > } > return null; > } > {code} > > > h3. Steps to Reproduce: > > {code:java} > @Test > public void testNullAntVariable() throws IOException { > // Create or retrieve an engine > JexlEngine jexl = new JexlBuilder().create(); > // on recent code: JexlEngine jexl = new > JexlBuilder().safe(false).create(); > // Populate to identical global variables > JexlContext jc = new MapContext(); > jc.set("NormalVariable", null); > jc.set("ant.variable", null); > // Evaluate the value of the normal variable > JexlExpression expression1 = jexl.createExpression("NormalVariable"); > Object o1 = expression1.evaluate(jc); > Assert.assertEquals(null, o1); > // Evaluate the value of the ant-style variable > JexlExpression expression2 = jexl.createExpression("ant.variable"); > Object o2 = expression2.evaluate(jc); // <-- BUG: throws exception > instead of returning null > Assert.assertEquals(null, o2); > } > {code} > > > h3. What Happens: > "expression2.evaluate(jc)" throws an JexlException$Variable exception with > text like "variable 'ant.variable' is null". > h3. 
Expected Result: > expression2.evaluate(jc) returns the value of 'ant.variable', which is null. > h3. > Note: > This was found on JEXL 3.1, the latest official release. I reproduced it on a > snapshot of JEXL 3.2 built from github source, but had to disable "safe". > h3. > Impact: > My organization uses JEXL to build datasets for clinical trials. In our > domain, it's very common to have an expression that is simply the name of a > variable whose value is desired. In our domain, we want any sloppy > expressions to be a hard error, so we use strict engines and will use > "safe=false" when we update to JEXL 3.2. In our domain, "null" has a specific > meaning (it means "missing"). A
[jira] [Created] (NUMBERS-143) Investigate Math.hypot for computing the absolute of a complex number
Alex Herbert created NUMBERS-143: Summary: Investigate Math.hypot for computing the absolute of a complex number Key: NUMBERS-143 URL: https://issues.apache.org/jira/browse/NUMBERS-143 Project: Commons Numbers Issue Type: Task Components: complex Reporter: Alex Herbert {{Math.hypot}} computes the value {{sqrt(x^2+y^2)}} to within 1 ULP. The function uses the [e_hypot.c|https://www.netlib.org/fdlibm/e_hypot.c] implementation from the Freely Distributable Math Library (fdlibm). Pre-Java 9 this function used JNI to call an external implementation. The performance was slow. Java 9 ported the function to native Java (see [JDK-7130085 : Port fdlibm hypot to Java|https://bugs.java.com/bugdatabase/view_bug.do?bug_id=7130085]). This function is used to define the absolute value of a complex number. It is also used in sqrt() and log(). This ticket is to investigate the performance and accuracy of {{Math.hypot}} against alternatives for use in Complex. -- This message was sent by Atlassian Jira (v8.3.4#803005)
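To illustrate why {{Math.hypot}} (or an equivalent scaled norm) matters for the absolute value of a complex number: the naive {{sqrt(x*x + y*y)}} overflows for components around 1e200 even though the true magnitude is well within the double range. A minimal sketch using only the standard library (class name hypothetical):

```java
public class HypotVsNaive {
    public static void main(String[] args) {
        double x = 3e200;
        double y = 4e200;
        // Naive form: x*x overflows to infinity before the sqrt is taken.
        double naive = Math.sqrt(x * x + y * y);
        // Math.hypot scales internally, so the finite magnitude survives.
        double hypot = Math.hypot(x, y);
        System.out.println(naive); // Infinity
        System.out.println(hypot); // ~5.0E200
    }
}
```

The accuracy question raised in the ticket (1 ULP vs alternatives) is separate from this overflow behaviour; the sketch only shows the range issue.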
[jira] [Commented] (GEOMETRY-50) Overflow in Vector norm and distance
[ https://issues.apache.org/jira/browse/GEOMETRY-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028420#comment-17028420 ] Alex Herbert commented on GEOMETRY-50: -- Checking each value for over/underflow is costly. Even if the branch practically never occurs and branch prediction can learn to ignore the 'safe' branches, there is still a cost to having to ensure those branches are not needed. You could run this with a size of 10 and see if SafeNorm improves relative to the alternatives. At that data size the branch prediction will be much better. Also note that SafeNorm is written for arrays of any length. It could be unrolled for length 2 and 3. That may be an interesting addition to the benchmark. {{Math.hypot}} ensures accuracy below 1 ULP. It does this with careful computation of {{x^2+y^2}} to minimise round-off. This may not be needed here. However my tests show that this computation is not the only factor in performance; the number of unpredictable branches in the code plays a key role. I am going to put some analysis results on {{Math.hypot}} under a ticket for numbers. I'll link to this issue for reference. > Overflow in Vector norm and distance > > > Key: GEOMETRY-50 > URL: https://issues.apache.org/jira/browse/GEOMETRY-50 > Project: Apache Commons Geometry > Issue Type: Bug >Reporter: Baljit Singh >Priority: Major > > In Euclidean Vector classes (Vector2D, Vector3D), norm() and distance() rely > on Math.sqrt(), which can overflow if the components of the vectors are > large. Instead, they should rely on SafeNorm. 
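The over/underflow protection being discussed amounts to factoring the largest magnitude out of the sum of squares before squaring anything. A minimal 2D sketch of that idea (not the actual SafeNorm code, which handles arbitrary-length arrays; class and method names hypothetical):

```java
public class ScaledNorm {
    /** Overflow-safe 2-norm: divide through by the largest component first. */
    static double norm(double x, double y) {
        double ax = Math.abs(x);
        double ay = Math.abs(y);
        double max = Math.max(ax, ay);
        if (max == 0.0) {
            return 0.0;
        }
        double rx = ax / max;
        double ry = ay / max;
        // Both ratios are <= 1, so the squares cannot overflow.
        return max * Math.sqrt(rx * rx + ry * ry);
    }

    public static void main(String[] args) {
        System.out.println(norm(3e200, 4e200));                        // ~5.0E200
        System.out.println(Math.sqrt(3e200 * 3e200 + 4e200 * 4e200)); // Infinity
    }
}
```

The extra comparison, branch and two divisions per call are exactly the kind of cost the comment above attributes to the 'safe' variants in the benchmark.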
[jira] [Commented] (NUMBERS-142) Improve LinearCombination accuracy during summation of the round-off errors
[ https://issues.apache.org/jira/browse/NUMBERS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028370#comment-17028370 ] Gilles Sadowski commented on NUMBERS-142: - bq. retain the use of primitives for performance and convenience. Sure; it is necessary even if just to quantify the impact (of using objects) on performance. bq. add alternative implementations that use Field. That would be the proposal. I was wondering whether this task could become a GSoC project. > Improve LinearCombination accuracy during summation of the round-off errors > --- > > Key: NUMBERS-142 > URL: https://issues.apache.org/jira/browse/NUMBERS-142 > Project: Commons Numbers > Issue Type: Improvement > Components: arrays >Affects Versions: 1.0 >Reporter: Alex Herbert >Assignee: Alex Herbert >Priority: Minor > Attachments: array_performance.jpg, cond_no.jpg, > error_vs_condition_no.jpg, inline_perfomance.jpg > > > The algorithm in LinearCombination is an implementation of dot2 from [Ogita > et al|http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.2.1547] > (algorithm 5.3). There is a subtle difference in that the original dot2 > algorithm sums the round-off from the products and the round-off from the > product summation together. The method in LinearCombination sums them > separately (using an extra variable) and adds them at the end. This actually > improves the accuracy under conditions where the sum of the round-off terms is > greater than that of the products, as described below. > The dot2 algorithm suffers when the total sum is close to 0 but the > intermediate sum is far enough from zero that there is a large difference > between the exponents of summed terms and the final result. In this case the > sum of the round-off is more important than the sum of the products, which due > to massive cancellation is zero. 
The case is important for Complex numbers > which require a computation of log1p(x^2+y^2-1) when x^2+y^2 is close to 1 > such that log(1) would be ~zero yet the true logarithm is representable to > very high precision. > This can be protected against by using the dotK algorithm of Ogita et al with > K>2. This saves all the round-off parts from each product and the running > sum. These are subject to an error free transform that repeatedly adds > adjacent pairs to generate a new split pair with a closer upper and lower > part. Over time this will order the parts from low to high and these can be > summed low first for an error free dot product. > Using this algorithm with a single pass (K=3 for dot3) removes the > cancellation error observed in the mentioned use case. Adding a single pass > over the parts changes the algorithm from 25n floating point operations > (flops) to 31n flops for the sum of n products. > A second change for the algorithm is to switch to using > [Dekker's|https://doi.org/10.1007/BF01397083] algorithm (Dekker, 1971) to > split the number. This extracts two 26-bit mantissas from a 53-bit mantissa (in > IEEE 754 the leading bit in front of the 52-bit mantissa is assumed to > be 1). This is done by multiplication by 2^s+1 with s = ceil(53/2) = 27: > big = (2^s+1) * a > a_hi = (big - (big - a)) > The extra bit of precision is carried into the sign of the low part of the > split number. > This is in contrast to the method in LinearCombination that uses a simple > mask on the long representation to obtain the a_hi part in 26 bits, while the > lower part will be 27 bits. > The advantage of Dekker's method is it produces 2 parts with 26 bits in the > mantissa that can be multiplied exactly. The disadvantage is the potential > for overflow requiring a branch condition to check for extreme values. 
> It also appropriately handles very small sub-normal numbers that would be > masked to create a 0 high part with all the non-zero bits left in the low > part using the current method. This will have knock-on effects on split > multiplication, which requires the high part to be larger. > A simple change to the current implementation to use Dekker's split improves > the precision on a wide range of test data (not shown). 
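Dekker's split described above can be sketched in a few lines of Java; the identity hi + lo == a holds exactly, and the 26-bit halves can then be multiplied without rounding error (class and method names hypothetical):

```java
public class DekkerSplitDemo {
    /** 2^27 + 1, the multiplier for s = ceil(53/2) = 27. */
    private static final double MULTIPLIER = 1.0 + 0x1.0p27;

    /** High part of Dekker's split: a value with a 26-bit significand. */
    static double highPart(double a) {
        double big = MULTIPLIER * a;
        return big - (big - a);
    }

    public static void main(String[] args) {
        double a = Math.PI;
        double hi = highPart(a);
        // The subtraction is exact, so hi + lo reconstructs a with no error;
        // the sign of lo carries the extra bit of precision mentioned above.
        double lo = a - hi;
        System.out.println(hi + lo == a); // true: the split is exact
    }
}
```

The overflow hazard noted above is visible here: for a near Double.MAX_VALUE, MULTIPLIER * a overflows, which is why a production version needs a guard (or pre-scaling) for extreme inputs.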
[jira] [Commented] (GEOMETRY-50) Overflow in Vector norm and distance
[ https://issues.apache.org/jira/browse/GEOMETRY-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028366#comment-17028366 ] Gilles Sadowski commented on GEOMETRY-50: - In the context of "Commons Geometry" usage, are there examples where {{Math.hypot}} is required? If not, a common implementation of the norm computation would avoid some future puzzling about the performance issue. bq. {{SafeNorm}} [is] a modest performance hit in 3D. (5026-4191)/(12742-4111) = 0.0967 So, if IIUC the above table, {{SafeNorm}} is ~10 times slower. > Overflow in Vector norm and distance > > > Key: GEOMETRY-50 > URL: https://issues.apache.org/jira/browse/GEOMETRY-50 > Project: Apache Commons Geometry > Issue Type: Bug >Reporter: Baljit Singh >Priority: Major > > In Euclidean Vector classes (Vector2D, Vector3D), norm() and distance() rely > on Math.sqrt(), which can overflow if the components of the vectors are > large. Instead, they should rely on SafeNorm. 