issues@commons.apache.org
[ https://issues.apache.org/jira/browse/LANG-710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061027#comment-13061027 ]

Henri Yandell commented on LANG-710:

So the basic issue imo is that ParseException is a typed exception - we'd have to introduce it to the StringEscapeUtils API.

I'm uncomfortable throwing a random IllegalArgumentException (or similar) when the bad data is passed in. That may be the typed-exception fan in me speaking. I don't like discovering at 4am that someone found a piece of data that caused a heretofore unknown runtime exception to occur.

So we have three options:

1: Leave the data unescaped because it is poorly typed.
2: Claim that we're dealing with XHTML and throw an exception.
3: Escape the data.

All the options seem useful, but none of them seem perfect. So I've implemented all three.

svn ci -m "Making unescapeHtml _NOT_ escape unfinished numeric entities by default (it ignores them); however adding options that will fire an exception or unescape the numeric entity. LANG-710"
Sending src/main/java/org/apache/commons/lang3/text/translate/NumericEntityUnescaper.java
Sending src/test/java/org/apache/commons/lang3/text/translate/NumericEntityUnescaperTest.java
Transmitting file data ..
Committed revision 1143641.

> StringIndexOutOfBoundsException when calling unescapeHtml4("")
> --
>
> Key: LANG-710
> URL: https://issues.apache.org/jira/browse/LANG-710
> Project: Commons Lang
> Issue Type: Bug
> Components: lang.*
> Affects Versions: 3.0
> Environment: java version "1.6.0_24"
> Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
> Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
> Reporter: Benjamin Valentin
> Assignee: Henri Yandell
> Priority: Minor
> Labels: StringEscapeUtils, StringUtils
> Fix For: 3.0
>
>
> When calling unescapeHtml4() on the String "" (or any String that contains these characters) an Exception is thrown:
> Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 4
>     at java.lang.String.charAt(String.java:686)
>     at org.apache.commons.lang3.text.translate.NumericEntityUnescaper.translate(NumericEntityUnescaper.java:49)
>     at org.apache.commons.lang3.text.translate.AggregateTranslator.translate(AggregateTranslator.java:53)
>     at org.apache.commons.lang3.text.translate.CharSequenceTranslator.translate(CharSequenceTranslator.java:88)
>     at org.apache.commons.lang3.text.translate.CharSequenceTranslator.translate(CharSequenceTranslator.java:60)
>     at org.apache.commons.lang3.StringEscapeUtils.unescapeHtml4(StringEscapeUtils.java:351)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
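For readers who want to see the three behaviours side by side, here is a minimal, self-contained sketch. It is not the committed NumericEntityUnescaper code; the class name, the Mode enum and its constants are hypothetical and only illustrate "ignore", "throw" and "unescape anyway" for a numeric entity that lacks its closing semicolon (for example `&#65` instead of `&#65;`). It also checks bounds before peeking ahead, which is the kind of guard the reported StringIndexOutOfBoundsException suggests was missing.

```java
final class NumericEntitySketch {

    /** Hypothetical option names; the real class may name these differently. */
    enum Mode { IGNORE_UNTERMINATED, ERROR_IF_UNTERMINATED, UNESCAPE_UNTERMINATED }

    static String unescape(String input, Mode mode) {
        StringBuilder out = new StringBuilder();
        int len = input.length();
        int i = 0;
        while (i < len) {
            // Need "&#" plus at least one more character; check bounds before charAt.
            if (input.charAt(i) == '&' && i + 2 < len && input.charAt(i + 1) == '#') {
                int start = i + 2;
                boolean hex = false;
                char first = input.charAt(start);
                if (first == 'x' || first == 'X') {
                    hex = true;
                    start++;
                }
                int radix = hex ? 16 : 10;
                int end = start;
                while (end < len && Character.digit(input.charAt(end), radix) >= 0) {
                    end++;
                }
                boolean terminated = end < len && input.charAt(end) == ';';
                if (end > start) {
                    int codePoint;
                    try {
                        codePoint = Integer.parseInt(input.substring(start, end), radix);
                    } catch (NumberFormatException e) {
                        codePoint = -1; // too many digits: treat as plain text
                    }
                    if (Character.isValidCodePoint(codePoint)) {
                        if (!terminated && mode == Mode.ERROR_IF_UNTERMINATED) {
                            throw new IllegalArgumentException(
                                    "Numeric entity without ';' at index " + i);
                        }
                        if (terminated || mode == Mode.UNESCAPE_UNTERMINATED) {
                            out.appendCodePoint(codePoint);
                            i = terminated ? end + 1 : end;
                            continue;
                        }
                    }
                }
            }
            // Not an entity (or an ignored unfinished one): copy the character through.
            out.append(input.charAt(i));
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        for (Mode mode : new Mode[] {Mode.IGNORE_UNTERMINATED, Mode.UNESCAPE_UNTERMINATED}) {
            System.out.println(mode + ": " + unescape("x &#65 y &#66; z", mode));
        }
    }
}
```

The default in this sketch would be IGNORE_UNTERMINATED, mirroring the commit message's statement that unfinished entities are ignored unless an option says otherwise.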
[jira] [Commented] (JEXL-113) Dot notation behaves unexpectedly with null values
[ https://issues.apache.org/jira/browse/JEXL-113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060990#comment-13060990 ]

Henri Biestro commented on JEXL-113:

Hi Max,
Not sure which way to go:
1/ allow an option to prevent the 'dot' operator: all variables are 'antish' and array access is needed to get to properties?
2/ allow an option to prevent the 'antish' variables; no variable can be 'antish', the 'dot' operator always accesses a property?
3/ another solution would be to white-list classes / properties to restrict which ones can participate in the 'dot'/'array-reference' resolution
Any opinion, preferred choice ?
Cheers
Henrib

> Dot notation behaves unexpectedly with null values
> --
>
> Key: JEXL-113
> URL: https://issues.apache.org/jira/browse/JEXL-113
> Project: Commons JEXL
> Issue Type: Bug
> Affects Versions: 2.0.1
> Environment: JDK 1.6
> Reporter: Max Tardiveau
>
> When a variable of the form a.b is evaluated, the context is asked first for the value of a. That value is then asked for the value of b.
> So far, so good: this is exactly what you'd expect from the dot operator.
> But if the value of b is null, the context is then asked for the value of a.b, in other words the dot operator is ignored and "a.b" is considered to be a single variable.
> This is at best confusing. Granted, this can be avoided with the a['b'] notation, but that's clumsy.
> I assume this is an attempt to support both the dot operator and ant-style variables. I don't think you can have both and remain sane.
> Suggestion: either document this behavior, or make it an option. My vote would be to just use the value returned, even if it's null. Either dot is an operator, or it's not. Perhaps make that configurable?
> Thanks!
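The resolution order described in the report, and the effect of an option that switches the ant-style fallback off, can be sketched without JEXL at all. The following is a hypothetical stand-alone model: the context is a plain Map, properties are looked up in nested Maps, and the antishFallback flag is an invented name, not a JEXL setting.

```java
import java.util.HashMap;
import java.util.Map;

final class DotResolutionSketch {

    /**
     * Resolves "owner.property": first as a property of the owner's value,
     * then (only if allowed) as a single ant-style variable named "owner.property".
     */
    static Object resolve(Map<String, Object> context, String owner, String property,
                          boolean antishFallback) {
        Object ownerValue = context.get(owner);
        Object value = null;
        if (ownerValue instanceof Map) {
            value = ((Map<?, ?>) ownerValue).get(property);
        }
        if (value == null && antishFallback) {
            // This is the step the reporter finds surprising: the dot stops being an operator.
            value = context.get(owner + "." + property);
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, Object> a = new HashMap<>();
        a.put("b", null);                          // a.b exists but is null
        Map<String, Object> context = new HashMap<>();
        context.put("a", a);
        context.put("a.b", "antish value");        // unrelated ant-style variable

        System.out.println(resolve(context, "a", "b", true));
        System.out.println(resolve(context, "a", "b", false));
    }
}
```

With the fallback enabled the null property is silently replaced by the ant-style variable; with it disabled the caller gets the null that the property actually holds, which is the behaviour the reporter votes for.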
[jira] [Commented] (FILEUPLOAD-193) FileNotFoundException thrown by DiskFileItem.write
[ https://issues.apache.org/jira/browse/FILEUPLOAD-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060897#comment-13060897 ]

Dan Washusen commented on FILEUPLOAD-193:

I haven't noticed this issue since reporting it (no changes to my code).

Jagub, to answer your questions...

No I'm not calling DiskFileItem#write twice. I call this method while iterating over the list of FileItems returned from ServletFileUpload#parseRequest. Each request contains several form fields and a single file.

Yes, I have a FileCleaningTracker registered with the DiskFileItemFactory. However, I call the write method almost immediately after calling ServletFileUpload#parseRequest (so I wouldn't have thought the FileCleaningTracker would have had a chance to do anything yet).

> FileNotFoundException thrown by DiskFileItem.write
> --
>
> Key: FILEUPLOAD-193
> URL: https://issues.apache.org/jira/browse/FILEUPLOAD-193
> Project: Commons FileUpload
> Issue Type: Bug
> Affects Versions: 1.2.2
> Environment: Ubuntu 10.10
> java version "1.6.0_24"
> Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
> Java HotSpot(TM) Client VM (build 19.1-b02, mixed mode, sharing)
> Reporter: Dan Washusen
> Priority: Critical
>
> Under certain conditions the DiskFileItem.write throws a FileNotFound exception. It seems to be when outputFile.renameTo(file) fails.
> {code}java.io.FileNotFoundException: /tmp/UploadController/uploading/upload_69651d04_13000a31964__8000_1651.tmp (No such file or directory)
>     at java.io.FileInputStream.open(Native Method)
>     at java.io.FileInputStream.<init>(FileInputStream.java:106)
>     at org.apache.commons.fileupload.disk.DiskFileItem.write(DiskFileItem.java:447)
>     at upload.UploadController.handle(UploadController.java:90)
>     ...{code}
> I can't see any obvious reason why the source file (outputFile) wouldn't exist...
[jira] [Updated] (MATH-607) Current Multiple Regression Object does calculations with all data incore. There are non incore techniques which would be useful with large datasets.
[ https://issues.apache.org/jira/browse/MATH-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

greg sterijevski updated MATH-607:

Attachment: updating_reg_cut2

Phil,
Attached is the patch based on your comments. Please review.
-Greg

> Current Multiple Regression Object does calculations with all data incore. There are non incore techniques which would be useful with large datasets.
> -
>
> Key: MATH-607
> URL: https://issues.apache.org/jira/browse/MATH-607
> Project: Commons Math
> Issue Type: New Feature
> Affects Versions: 3.0
> Environment: Java
> Reporter: greg sterijevski
> Labels: Gentleman's, QR, Regression, Updating, decomposition, lemma
> Fix For: 3.0
>
> Attachments: updating_reg_cut2, updating_reg_ifaces
>
> Original Estimate: 840h
> Remaining Estimate: 840h
>
> The current multiple regression class does a QR decomposition on the complete data set. This necessitates the loading incore of the complete dataset. For large datasets, or large datasets and a requirement to do datamining or stepwise regression this is not practical. There are techniques which form the normal equations on the fly, as well as ones which form the QR decomposition on an update basis. I am proposing, first, the specification of an "UpdatingLinearRegression" interface which defines basic functionality all such techniques must fulfill.
> Related to this 'updating' regression, the results of running a regression on some subset of the data should be encapsulated in an immutable object. This is to ensure that subsequent additions of observations do not corrupt or render inconsistent parameter estimates. I am calling this interface "RegressionResults".
> Once the community has reached a consensus on the interface, work on the concrete implementation of these techniques will take place.
> Thanks,
> -Greg
[jira] [Updated] (MATH-612) Optimisation for QRDecomposition, BiDiagonalTransformer and TriDiagonalTransformer
[ https://issues.apache.org/jira/browse/MATH-612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Nix updated MATH-612:

Description:
QRDecomposition, BiDiagonalTransformer and TriDiagonalTransformer all contain methods that create an empty matrix for population in calculations employing getEntry and setEntry on the matrix. Methods getEntry and setEntry perform a check to ensure the matrix indices are in bounds. This overhead of method calls is detrimental within loops that iterate many times.
Methods within QRDecomposition, BiDiagonalTransformer and TriDiagonalTransformer have significantly improved performance over large matrices if, instead of creating an empty RealMatrix and then using getEntry and setEntry, we create a double array for direct access and create a RealMatrix from it at the end.

was:
QRDecomposition, BiDiagonalTransformer and TriDiagonalTransformer all contain methods that create an empty matrix for population in calculations employing getEntry and setEntry on the matrix. Methods getEntry and setEntry perform a check to ensure the matrix indices are in bounds. This check is not necessary if the calling method has already ensured them to be in bounds, for example by checking the max and min index as a looping parameter.
Methods within QRDecomposition, BiDiagonalTransformer and TriDiagonalTransformer have significantly improved performance over large matrices if, instead of creating an empty RealMatrix and then using getEntry and setEntry, we create a double array for direct access and create a RealMatrix from it at the end.

> Optimisation for QRDecomposition, BiDiagonalTransformer and TriDiagonalTransformer
> --
>
> Key: MATH-612
> URL: https://issues.apache.org/jira/browse/MATH-612
> Project: Commons Math
> Issue Type: Improvement
> Affects Versions: 3.0, Nightly Builds
> Reporter: Christopher Nix
> Priority: Minor
> Labels: patch
> Fix For: 3.0, Nightly Builds
>
> Attachments: BiDiagonalTransformer.patch, QRDecompositionImpl.patch, TriDiagonalTransformer.patch
>
>
> QRDecomposition, BiDiagonalTransformer and TriDiagonalTransformer all contain methods that create an empty matrix for population in calculations employing getEntry and setEntry on the matrix. Methods getEntry and setEntry perform a check to ensure the matrix indices are in bounds. This overhead of method calls is detrimental within loops that iterate many times.
> Methods within QRDecomposition, BiDiagonalTransformer and TriDiagonalTransformer have significantly improved performance over large matrices if, instead of creating an empty RealMatrix and then using getEntry and setEntry, we create a double array for direct access and create a RealMatrix from it at the end.
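The pattern proposed in the description can be illustrated with a small stand-alone sketch. CheckedMatrix below is a hypothetical stand-in for a bounds-checked matrix, not the Commons Math RealMatrix, and the upper-triangle extraction is just an example workload. The sketch shows the two coding styles producing the same result; it makes no claim about how large the speed-up is on any given JVM.

```java
final class DirectArraySketch {

    /** Minimal bounds-checked matrix; a stand-in, NOT the Commons Math RealMatrix. */
    static final class CheckedMatrix {
        private final double[][] data;

        CheckedMatrix(int rows, int cols) {
            data = new double[rows][cols];
        }

        /** Wraps the given array without copying it. */
        CheckedMatrix(double[][] data) {
            this.data = data;
        }

        double getEntry(int r, int c) {
            check(r, c);
            return data[r][c];
        }

        void setEntry(int r, int c, double v) {
            check(r, c);
            data[r][c] = v;
        }

        private void check(int r, int c) {
            if (r < 0 || r >= data.length || c < 0 || c >= data[0].length) {
                throw new IndexOutOfBoundsException("(" + r + ", " + c + ")");
            }
        }
    }

    /** Style 1: populate an empty matrix through checked accessors on every iteration. */
    static CheckedMatrix upperViaAccessors(CheckedMatrix src, int n) {
        CheckedMatrix out = new CheckedMatrix(n, n);
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                out.setEntry(i, j, src.getEntry(i, j));
            }
        }
        return out;
    }

    /** Style 2: fill a raw double[][] directly and wrap it once at the end. */
    static CheckedMatrix upperViaArray(double[][] src, int n) {
        double[][] out = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                out[i][j] = src[i][j]; // loop bounds already guarantee valid indices
            }
        }
        return new CheckedMatrix(out);
    }

    public static void main(String[] args) {
        double[][] raw = {{1, 2}, {3, 4}};
        CheckedMatrix a = upperViaAccessors(new CheckedMatrix(raw), 2);
        CheckedMatrix b = upperViaArray(raw, 2);
        System.out.println(a.getEntry(0, 1) == b.getEntry(0, 1));
    }
}
```

The second style relies on the loop limits to keep indices valid, which is the point made in the earlier wording of the description.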
[jira] [Commented] (MATH-607) Current Multiple Regression Object does calculations with all data incore. There are non incore techniques which would be useful with large datasets.
[ https://issues.apache.org/jira/browse/MATH-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060814#comment-13060814 ]

Phil Steitz commented on MATH-607:

I did not see the parameter covariance matrix in RegressionResults. I agree with your basic point on this, though. I am less concerned with wanting to add stuff than including things that we either wish we had omitted (e.g. the redundancy stuff as just an example) or typed or constrained differently. How about starting with a minimalist concrete class and once we have the interface stabilized, we can peel it off and keep the class for persisting / serializing results.

Sorry to flip/flop, but looking carefully at the UpdatingLinearRegression interface again, I think it is fine to just add it as an interface. I would suggest the s/data/observation change in my last comment though and maybe renaming it to UpdatingMultipleLinearRegression.
[jira] [Commented] (COMPRESS-136) Create common POSIXArchiveEntry for use with tar/cpio/dump
[ https://issues.apache.org/jira/browse/COMPRESS-136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060813#comment-13060813 ]

Bear Giles commented on COMPRESS-136:

Implementation note: Java 7 will add two packages with similar content.
See http://download.java.net/jdk7/docs/api/index.html?java/nio/file/Files.html
More info: http://java.sun.com/developer/technicalArticles/javase/nio/
Obviously the CC classes can't use J7 classes but it could be useful as a model for method names and the like.

> Create common POSIXArchiveEntry for use with tar/cpio/dump
> --
>
> Key: COMPRESS-136
> URL: https://issues.apache.org/jira/browse/COMPRESS-136
> Project: Commons Compress
> Issue Type: Improvement
> Components: Archivers
> Reporter: Bear Giles
> Priority: Minor
>
> There's currently a top-level ArchiveEntry but many archivers are for POSIX (or POSIX-like) filesystems with standard attributes - ownership, permissions, access times, etc. By creating a common ArchiveEntry class for these attributes it will be much easier to write archive-agnostic extraction software. (Many of the existing methods should be marked 'deprecated' but left in place indefinitely.)
> I have a prototype of this class in my dump patch.
[jira] [Commented] (MATH-607) Current Multiple Regression Object does calculations with all data incore. There are non incore techniques which would be useful with large datasets.
[ https://issues.apache.org/jira/browse/MATH-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060800#comment-13060800 ]

greg sterijevski commented on MATH-607:

On the results object: There are vars * (vars + 1) / 2 elements in the cov matrix, vars int parameters, vars int standard errors and some other assorted stuff. Not terribly large at first. However, consider doing panel regression via dummy variables: the covariance matrix can get large very quickly. That being said, I don't think making RegressionResults a concrete class is a gamestopper. Should I send a follow up patch with results made concrete?

On the regression object: Are you concerned that we will be removing methods from any interface we specify today? Or do you think the contract is too restrictive? The reason I am pushing for an interface is that I have two candidates for concrete implementation of updating regression.

The first implementation is based on Gentleman's lemma and is detailed in the following article:
Algorithm AS 274: Least Squares Routines to Supplement those of Gentleman
Author: Alan J Miller
Source: Journal of the Royal Statistical Society Vol 41 No 2 (1992)

The second approach is one detailed by this article by Goodnight:
A Tutorial on the SWEEP Operator
James H. Goodnight
The American Statistician, Vol. 33, No. 3. (Aug., 1979), pp. 149-158.

The first approach never forms the cross products matrix, the second does. They are significantly different approaches to dealing with large data sets. How would I do this in the concrete class you propose?

Thanks,
-Greg
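The element count quoted for the covariance matrix follows from storing only one triangle of a symmetric matrix. A minimal sketch of such packed storage is below; the class and method names are hypothetical, and row-major lower-triangle packing is only one of several possible layouts, not something taken from the attached patch.

```java
final class PackedSymmetricSketch {

    /** Number of stored elements for an n x n symmetric matrix: n * (n + 1) / 2. */
    static int packedLength(int n) {
        return n * (n + 1) / 2;
    }

    /** Maps (i, j) to its slot in the packed lower triangle; (i, j) and (j, i) share a slot. */
    static int index(int i, int j) {
        int row = Math.max(i, j);
        int col = Math.min(i, j);
        return row * (row + 1) / 2 + col;
    }

    public static void main(String[] args) {
        int vars = 3;
        double[] cov = new double[packedLength(vars)];
        cov[index(2, 0)] = 0.25;                 // write through one triangle
        System.out.println(cov[index(0, 2)]);    // read back through the other
    }
}
```

For a dummy-variable panel regression with, say, 1000 regressors this layout holds 500500 doubles instead of 1000000, which is the growth the comment is pointing at.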
[jira] [Commented] (MATH-607) Current Multiple Regression Object does calculations with all data incore. There are non incore techniques which would be useful with large datasets.
[ https://issues.apache.org/jira/browse/MATH-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060797#comment-13060797 ]

Phil Steitz commented on MATH-607:

Thanks, I forgot to mention that important point. Initially, we took the "take what we are given" approach, but that proved confusing and error-prone for users (forcing them to add unitary columns to input data). I think it is best to expect no unitary columns in the design matrix and have the user explicitly specify "noIntercept" to estimate a model without an intercept term. This is how the MultipleLinearRegression impls now work. (See the javadoc for newSampleData in AbstractMultipleLinearRegression). In the updating impls, this can work the same way, allowing users to omit initial "1"s from added rows. I guess this will have to be a constructor parameter to work correctly in the updating impls.

Another thing I forgot to mention is careful specification and validation of array shape constraints on input data (i.e., when things have to be rectangular and/or of length = previously determined nVars). I liked the lack of a setter for the number of explanatory variables, but that means the first addData becomes definitional.

One final suggestion - maybe the row version of addData should be addObservation and the matrix version should be addObservations.
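The suggestions in this comment (a noIntercept constructor flag, addObservation / addObservations naming, shape validation, and the first added row fixing the number of explanatory variables) can be sketched as follows. This is a hypothetical illustration that accumulates X'X and X'y on the fly; it is not the interface from the attached patch, and every name other than noIntercept, addObservation and addObservations is invented for the example.

```java
final class UpdatingRegressionSketch {
    private final boolean noIntercept;
    private int nVars = -1;   // number of explanatory variables, fixed by the first observation
    private long nObs;
    private double[][] xtx;   // running X'X
    private double[] xty;     // running X'y

    UpdatingRegressionSketch(boolean noIntercept) {
        this.noIntercept = noIntercept;
    }

    /** Adds one row; callers never supply the leading 1 for the intercept. */
    void addObservation(double[] x, double y) {
        if (x == null || x.length == 0) {
            throw new IllegalArgumentException("observation needs at least one regressor");
        }
        if (nVars < 0) {
            nVars = x.length;                       // the first row is definitional
            int p = noIntercept ? nVars : nVars + 1;
            xtx = new double[p][p];
            xty = new double[p];
        } else if (x.length != nVars) {
            throw new IllegalArgumentException(
                    "expected " + nVars + " regressors, got " + x.length);
        }
        double[] row = designRow(x);
        for (int i = 0; i < row.length; i++) {
            xty[i] += row[i] * y;
            for (int j = 0; j < row.length; j++) {
                xtx[i][j] += row[i] * row[j];
            }
        }
        nObs++;
    }

    /** Adds a block of rows; the whole block is validated before any state changes. */
    void addObservations(double[][] x, double[] y) {
        if (x == null || y == null || x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same number of rows");
        }
        if (x.length == 0) {
            return;
        }
        int expected = nVars >= 0 ? nVars : (x[0] == null ? 0 : x[0].length);
        for (double[] row : x) {
            if (row == null || row.length == 0 || row.length != expected) {
                throw new IllegalArgumentException("block must be rectangular with "
                        + expected + " columns");
            }
        }
        for (int i = 0; i < x.length; i++) {
            addObservation(x[i], y[i]);
        }
    }

    /** Prepends the constant 1 unless the model was declared without an intercept. */
    private double[] designRow(double[] x) {
        if (noIntercept) {
            return x;
        }
        double[] row = new double[x.length + 1];
        row[0] = 1.0;
        System.arraycopy(x, 0, row, 1, x.length);
        return row;
    }

    long getNumberOfObservations() { return nObs; }
    double[][] getXtX() { return xtx; }
    double[] getXtY() { return xty; }

    public static void main(String[] args) {
        UpdatingRegressionSketch r = new UpdatingRegressionSketch(false);
        r.addObservations(new double[][] {{0}, {1}, {2}}, new double[] {1, 3, 5});
        System.out.println(r.getNumberOfObservations());
    }
}
```

Making noIntercept a constructor argument, as the comment guesses will be necessary, means the size of the accumulated cross products is known as soon as the first row arrives and cannot change afterwards.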
[jira] [Commented] (MATH-607) Current Multiple Regression Object does calculations with all data incore. There are non incore techniques which would be useful with large datasets.
[ https://issues.apache.org/jira/browse/MATH-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060792#comment-13060792 ]

greg sterijevski commented on MATH-607:

One more thing, on the subject of the adjusted R Squared. I am not sure I would include this, since this is dependent on knowledge that a constant exists. I currently envision being handed some data. If the data has a column which is nothing but ones, great. If not, great again. I could not come up with an elegant way to handle constant detection, and therefore a clean way to determine the Busse R squared.

I guess we could keep a flag for each regressor. If the regressor has a changed value then we would say it is not a constant. The other approach is to test the residuals for bias - if there is no bias, then constant or not we are okay. Though that would be messy since I do not keep the data around.

Either way makes for a bit of unpleasantness that yields very little?
[jira] [Commented] (MATH-607) Current Multiple Regression Object does calculations with all data incore. There are non incore techniques which would be useful with large datasets.
[ https://issues.apache.org/jira/browse/MATH-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060791#comment-13060791 ]

Phil Steitz commented on MATH-607:

I get your point on the Results interface. It did not look "large" to me at first (i.e., generally o(vars) vs o(obs)). If it could get "large" it would indeed be better to leave as an interface. The problem there is really nailing it because interfaces are very hard to change. My sense at this point is that we may want to rev this a few times before it is really stable, so a concrete class would be better to start with. Also, having the "value" class is handy. StatisticalSummaryValues is an example of that (which implements the interface that preceded it - so maybe having both is a good longer term solution). If it turns out to be too unwieldy to create the results factory methods, I am OK starting with the interface approach, but in that case we should review it very carefully prior to release.

I did not mean to suggest that UpdatingOLSRegression should be an abstract class. If and when a weighted or non-OLS updating regression is implemented, we might consider introducing an abstract parent, but I would need to see good reason for this. IMO, what we have now in OLS, WLS is of marginal value (I mean the abstract superclass and interface).
[jira] [Commented] (MATH-607) Current Multiple Regression Object does calculations with all data incore. There are non incore techniques which would be useful with large datasets.
[ https://issues.apache.org/jira/browse/MATH-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060782#comment-13060782 ] greg sterijevski commented on MATH-607: --- Sorry for duplicating part of my response, but gmail has truncated it (maybe google is telling me something about my ideas... ;0 ) My complete response is: I agree on eliminating getRedundant() and isRedundant(int idx). If the underlying solver is QR or Gaussian this info would exist. If the underlying method is SVD, then we would register the rank reduction, but we would not be able to attribute it to a particular column in the design matrix. I am probably in agreement with with making RegressionResults concrete, but there were a couple of considerations which forced me to interface. Say that I begin with the following augmented matrix: | X'X X'Y| | X'YY'Y| where X is the design matrix ( nobs x nreg ), Y is the dependent variable (nobs x 1 ) On a copy of the cross products matrix (the thing above), I get the following via gaussian elimination: | inv(X'X) -beta| | -beta e'e| inv(X'X) is the inverse of the X'X matrix. -beta is the OLS vector of slopes. e'e is the sum of squared errors. Getting most of the info (that RegressionResults surfaces) is simply a matter of indexing. All I need to do in this case is write a wrapper around a symmetric matrix which implements the interface. I suppose that there could be constructor which took the matrix above and did the indexing, but that seems too dirty. Furthermore, there are probably other optimized formats for OLS which have similar aspects. I wanted to keep the door open to other schemes, without making (potentially large) copies of variance matrices, standard errors and so forth a necessity. On the name of the getter for number of observations, I am okay with whatever you feel is a better name. Regarding the model interface, I would again suggest that we just define this as a class, UpdatingOLSRegression. 
I suppose that if we end up implementing a weighted or other non-OLS version, we might want to factor out a common interface like what exists for MultipleLinearRegression, but in retrospect, I am not sure that interface was worth much. Note that all that we could factor out is essentially what is in MultivariateRegression, which is analogous to your RegressionResults. So you are saying the UpdatingOLSRegression be an abstract class? There are not that many methods in the interface. That would be okay if were sure that subclasses always overrode either the regress(...) methods or the addObservations(...) methods. I worry that you might get have a base class full of nothing but abstract functions. > Current Multiple Regression Object does calculations with all data incore. > There are non incore techniques which would be useful with large datasets. > - > > Key: MATH-607 > URL: https://issues.apache.org/jira/browse/MATH-607 > Project: Commons Math > Issue Type: New Feature >Affects Versions: 3.0 > Environment: Java >Reporter: greg sterijevski > Labels: Gentleman's, QR, Regression, Updating, decomposition, > lemma > Fix For: 3.0 > > Attachments: updating_reg_ifaces > > Original Estimate: 840h > Remaining Estimate: 840h > > The current multiple regression class does a QR decomposition on the complete > data set. This necessitates the loading incore of the complete dataset. For > large datasets, or large datasets and a requirement to do datamining or > stepwise regression this is not practical. There are techniques which form > the normal equations on the fly, as well as ones which form the QR > decomposition on an update basis. I am proposing, first, the specification of > an "UpdatingLinearRegression" interface which defines basic functionality all > such techniques must fulfill. > Related to this 'updating' regression, the results of running a regression on > some subset of the data should be encapsulated in an immutable object. 
This > is to ensure that subsequent additions of observations do not corrupt or > render inconsistent parameter estimates. I am calling this interface > "RegressionResults". > Once the community has reached a consensus on the interface, work on the > concrete implementation of these techniques will take place. > Thanks, > -Greg -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
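The augmented-matrix idea in the comment above can be sketched in a few lines of Java. This is only an illustration, not the proposed Commons Math code: the class name SweepSketch and both method names are made up here, there is no pivoting or rank check, and the sign convention of this particular sweep leaves +beta in the last column and -beta in the last row, which may differ from the layout written in the comment. The cross products are accumulated one observation at a time, so the full data set never has to be held in memory.

```java
import java.util.Arrays;

/** Sketch only: sweep operator applied to the augmented cross-products matrix. */
final class SweepSketch {

    /** Sweeps the square matrix a in place on pivot k (no pivoting, no rank checks). */
    static void sweep(double[][] a, int k) {
        int n = a.length;
        double d = a[k][k];
        for (int j = 0; j < n; j++) {
            a[k][j] /= d;                      // normalise the pivot row
        }
        for (int i = 0; i < n; i++) {
            if (i == k) {
                continue;
            }
            double b = a[i][k];
            for (int j = 0; j < n; j++) {
                a[i][j] -= b * a[k][j];        // eliminate column k from row i
            }
            a[i][k] = -b / d;                  // store the multiplier in place
        }
        a[k][k] = 1.0 / d;
    }

    /**
     * Accumulates [X'X X'Y; Y'X Y'Y] row by row, then sweeps the X'X block.
     * Result layout (this convention): upper-left block is inv(X'X), last column
     * holds beta, last row holds -beta, bottom-right entry is e'e.
     */
    static double[][] sweptCrossProducts(double[][] x, double[] y) {
        int p = x[0].length;
        double[][] a = new double[p + 1][p + 1];
        for (int r = 0; r < x.length; r++) {
            double[] row = Arrays.copyOf(x[r], p + 1);
            row[p] = y[r];                     // augmented observation (x, y)
            for (int i = 0; i <= p; i++) {
                for (int j = 0; j <= p; j++) {
                    a[i][j] += row[i] * row[j];
                }
            }
        }
        for (int k = 0; k < p; k++) {
            sweep(a, k);
        }
        return a;
    }

    public static void main(String[] args) {
        double[][] x = {{1, 0}, {1, 1}, {1, 2}, {1, 3}};   // intercept column plus one regressor
        double[] y = {1, 3, 5, 8};
        double[][] s = sweptCrossProducts(x, y);
        System.out.println("beta = " + s[0][2] + ", " + s[1][2] + "; e'e = " + s[2][2]);
    }
}
```

For the four observations in main, working the arithmetic by hand gives slopes of roughly 0.8 and 2.3 and a sum of squared errors of roughly 0.3; everything a results object would report is then a matter of indexing into the swept matrix.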
[jira] [Issue Comment Edited] (MATH-607) Current Multiple Regression Object does calculations with all data incore. There are non incore techniques which would be useful with large datasets.
[ https://issues.apache.org/jira/browse/MATH-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060744#comment-13060744 ] Phil Steitz edited comment on MATH-607 at 7/6/11 6:31 PM: -- First, thanks for pushing this along and sorry to be slow to respond. I like both of the abstractions, but I am not sure that defining interfaces is the best way to go in either case. The reporting interface (RegressionResults) could be a concrete class and it is probably best to define a base class that omits some of the reported stats (e.g. isRedundant, getRedundant). Making this a class gives us more flexibility. It also makes it a little easier / more convenient for users who want to store off intermediate results. One thing that I would add to either the base or an extended version is adjusted R-square. I think it is also a good idea at this point to ask what else might be missing. Your suggestions on redundancy are a good example. For now, I would suggest making RegressionResults a serializable class as we finalize its contents. One small quibble on naming: s/getNobs/getNumberOfObservations or if that is too onerous getN (similar to other stats). Regarding the model interface, I would again suggest that we just define this as a class, UpdatingOLSRegression. I suppose that if we end up implementing a weighted or other non-OLS version, we might want to factor out a common interface like what exists for MultipleLinearRegression, but in retrospect, I am not sure that interface was worth much. Note that all that we could factor out is essentially what is in MultivariateRegression, which is analogous to your RegressionResults. So, modulo the one name change, I propose to just change these to classes and get going on the implementation. Any other suggestions on what we should add / modify in the RegressionResults? was (Author: psteitz): First, thanks for pushing this along and sorry to be slow to respond. 
I like both of the abstractions, but I am not sure that defining interfaces is the best way to go in either case. The reporting interface (RegressionResults) could be a concrete class and it is probably best to define a base class that omits some of the reported stats (e.g. isRedundant, getRedundant). Making this a class gives us more flexibility. It also makes it a little easier / more convenient for users who want to store off intermediate results. One thing that I would add to either the base or an extended version is adjusted R-square. I think it is also a good idea at the point to ask what else might be missing. Your suggestions on redundancy are a good example. For now, I would suggest making RegressionResults a serializable class as we finalize its contents. One small quibble on naming: s/getNobs/getNumberOfObservations or if that is too onerous getN (similar to other stats). Regarding the model interface, I would again suggest that we just define this as a class, UpdatingOLSRegression. I suppose that if we end up implementing a weighted or other non-OLS version, we might want to factor out a common interface like what exists for MultipleLinearRegression, but in retrospect, I am not sure that interface was worth much. Note that all that we could factor out is essentially what is in MultivariateRegression, which is analogous to your RegressionResults. So, modulo the one name change, I propose to just change these to classes and get going on the implementation. Any other suggestions on what we should add / modify in the RegressionResults? > Current Multiple Regression Object does calculations with all data incore. > There are non incore techniques which would be useful with large datasets. 
> - > > Key: MATH-607 > URL: https://issues.apache.org/jira/browse/MATH-607 > Project: Commons Math > Issue Type: New Feature >Affects Versions: 3.0 > Environment: Java >Reporter: greg sterijevski > Labels: Gentleman's, QR, Regression, Updating, decomposition, > lemma > Fix For: 3.0 > > Attachments: updating_reg_ifaces > > Original Estimate: 840h > Remaining Estimate: 840h > > The current multiple regression class does a QR decomposition on the complete > data set. This necessitates the loading incore of the complete dataset. For > large datasets, or large datasets and a requirement to do datamining or > stepwise regression this is not practical. There are techniques which form > the normal equations on the fly, as well as ones which form the QR > decomposition on an update basis. I am proposing, first, the specification of > an "UpdatingLinearRegression" interface which defines b
[jira] [Commented] (MATH-607) Current Multiple Regression Object does calculations with all data incore. There are non incore techniques which would be useful with large datasets.
[ https://issues.apache.org/jira/browse/MATH-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060744#comment-13060744 ] Phil Steitz commented on MATH-607: -- First, thanks for pushing this along and sorry to be slow to respond. I like both of the abstractions, but I am not sure that defining interfaces is the best way to go in either case. The reporting interface (RegressionResults) could be a concrete class and it is probably best to define a base class that omits some of the reported stats (e.g. isRedundant, getRedundant). Making this a class gives us more flexibility. It also makes it a little easier / more convenient for users who want to store off intermediate results. One thing that I would add to either the base or an extended version is adjusted R-square. I think it is also a good idea at this point to ask what else might be missing. Your suggestions on redundancy are a good example. For now, I would suggest making RegressionResults a serializable class as we finalize its contents. One small quibble on naming: s/getNobs/getNumberOfObservations or if that is too onerous getN (similar to other stats). Regarding the model interface, I would again suggest that we just define this as a class, UpdatingOLSRegression. I suppose that if we end up implementing a weighted or other non-OLS version, we might want to factor out a common interface like what exists for MultipleLinearRegression, but in retrospect, I am not sure that interface was worth much. Note that all that we could factor out is essentially what is in MultivariateRegression, which is analogous to your RegressionResults. So, modulo the one name change, I propose to just change these to classes and get going on the implementation. Any other suggestions on what we should add / modify in the RegressionResults? > Current Multiple Regression Object does calculations with all data incore. > There are non incore techniques which would be useful with large datasets. 
> - > > Key: MATH-607 > URL: https://issues.apache.org/jira/browse/MATH-607 > Project: Commons Math > Issue Type: New Feature >Affects Versions: 3.0 > Environment: Java >Reporter: greg sterijevski > Labels: Gentleman's, QR, Regression, Updating, decomposition, > lemma > Fix For: 3.0 > > Attachments: updating_reg_ifaces > > Original Estimate: 840h > Remaining Estimate: 840h > > The current multiple regression class does a QR decomposition on the complete > data set. This necessitates the loading incore of the complete dataset. For > large datasets, or large datasets and a requirement to do datamining or > stepwise regression this is not practical. There are techniques which form > the normal equations on the fly, as well as ones which form the QR > decomposition on an update basis. I am proposing, first, the specification of > an "UpdatingLinearRegression" interface which defines basic functionality all > such techniques must fulfill. > Related to this 'updating' regression, the results of running a regression on > some subset of the data should be encapsulated in an immutable object. This > is to ensure that subsequent additions of observations do not corrupt or > render inconsistent parameter estimates. I am calling this interface > "RegressionResults". > Once the community has reached a consensus on the interface, work on the > concrete implementation of these techniques will take place. > Thanks, > -Greg -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONFIGURATION-453) Set multiple properties at once
[ https://issues.apache.org/jira/browse/CONFIGURATION-453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060602#comment-13060602 ] Alexander Prishchepov commented on CONFIGURATION-453: - On second thought - there are methods: {noformat} load() reload() save() {noformat} in FileConfiguration. Some kind of CachedConfiguration interface would probably be more generic. > Set multiple properties at once > --- > > Key: CONFIGURATION-453 > URL: https://issues.apache.org/jira/browse/CONFIGURATION-453 > Project: Commons Configuration > Issue Type: Improvement >Reporter: Alexander Prishchepov >Priority: Minor > > It might be useful to set multiple properties by one call. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (FILEUPLOAD-193) FileNotFoundException thrown by DiskFileItem.write
[ https://issues.apache.org/jira/browse/FILEUPLOAD-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060560#comment-13060560 ] Julien Vey edited comment on FILEUPLOAD-193 at 7/6/11 1:22 PM: --- I have the same issue under some undetermined conditions. Have you found any workaround to avoid this issue ? (with version 1.2.1) was (Author: veyjul): I have the same issue under some undetermined conditions. Have you found any workaround to avoid this issue ? > FileNotFoundException thrown by DiskFileItem.write > -- > > Key: FILEUPLOAD-193 > URL: https://issues.apache.org/jira/browse/FILEUPLOAD-193 > Project: Commons FileUpload > Issue Type: Bug >Affects Versions: 1.2.2 > Environment: Ubuntu 10.10 > java version "1.6.0_24" > Java(TM) SE Runtime Environment (build 1.6.0_24-b07) > Java HotSpot(TM) Client VM (build 19.1-b02, mixed mode, sharing) >Reporter: Dan Washusen >Priority: Critical > > Under certain conditions the DiskFileItem.write throws a FileNotFound > exception. It seems to be when outputFile.renameTo(file) fails. > {code}java.io.FileNotFoundException: > /tmp/UploadController/uploading/upload_69651d04_13000a31964__8000_1651.tmp > (No such file or directory) > at java.io.FileInputStream.open(Native Method) > at java.io.FileInputStream.<init>(FileInputStream.java:106) > at > org.apache.commons.fileupload.disk.DiskFileItem.write(DiskFileItem.java:447) > at upload.UploadController.handle(UploadController.java:90) > ...{code} > I can't see any obvious reason why the source file (outputFile) wouldn't > exist... -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (FILEUPLOAD-193) FileNotFoundException thrown by DiskFileItem.write
[ https://issues.apache.org/jira/browse/FILEUPLOAD-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060560#comment-13060560 ] Julien Vey commented on FILEUPLOAD-193: --- I have the same issue under some undetermined conditions. Have you found any workaround to avoid this issue ? > FileNotFoundException thrown by DiskFileItem.write > -- > > Key: FILEUPLOAD-193 > URL: https://issues.apache.org/jira/browse/FILEUPLOAD-193 > Project: Commons FileUpload > Issue Type: Bug >Affects Versions: 1.2.2 > Environment: Ubuntu 10.10 > java version "1.6.0_24" > Java(TM) SE Runtime Environment (build 1.6.0_24-b07) > Java HotSpot(TM) Client VM (build 19.1-b02, mixed mode, sharing) >Reporter: Dan Washusen >Priority: Critical > > Under certain conditions the DiskFileItem.write throws a FileNotFound > exception. It seems to be when outputFile.renameTo(file) fails. > {code}java.io.FileNotFoundException: > /tmp/UploadController/uploading/upload_69651d04_13000a31964__8000_1651.tmp > (No such file or directory) > at java.io.FileInputStream.open(Native Method) > at java.io.FileInputStream.<init>(FileInputStream.java:106) > at > org.apache.commons.fileupload.disk.DiskFileItem.write(DiskFileItem.java:447) > at upload.UploadController.handle(UploadController.java:90) > ...{code} > I can't see any obvious reason why the source file (outputFile) wouldn't > exist... -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MATH-612) Optimisation for QRDecomposition, BiDiagonalTransformer and TriDiagonalTransformer
[ https://issues.apache.org/jira/browse/MATH-612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christopher Nix updated MATH-612: - Attachment: TriDiagonalTransformer.patch BiDiagonalTransformer.patch QRDecompositionImpl.patch > Optimisation for QRDecomposition, BiDiagonalTransformer and > TriDiagonalTransformer > -- > > Key: MATH-612 > URL: https://issues.apache.org/jira/browse/MATH-612 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 3.0, Nightly Builds >Reporter: Christopher Nix >Priority: Minor > Labels: patch > Fix For: 3.0, Nightly Builds > > Attachments: BiDiagonalTransformer.patch, QRDecompositionImpl.patch, > TriDiagonalTransformer.patch > > > QRDecomposition, BiDiagonalTransformer and TriDiagonalTransformer all contain > methods that create an empty matrix for population in calculations employing > getEntry and setEntry on the matrix. Methods getEntry and setEntry perform a > check to ensure the matrix indices are in bounds. This check is not > necessary if the calling method has already ensured them to be in bounds, for > example by checking the max and min index as a looping parameter. > Methods within QRDecomposition, BiDiagonalTransformer and > TriDiagonalTransformer have significantly improved performance over large > matrices if, instead of creating an empty RealMatrix and then using getEntry > and setEntry, we create a double array for direct access and create a > RealMatrix from it at the end. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MATH-612) Optimisation for QRDecomposition, BiDiagonalTransformer and TriDiagonalTransformer
Optimisation for QRDecomposition, BiDiagonalTransformer and TriDiagonalTransformer -- Key: MATH-612 URL: https://issues.apache.org/jira/browse/MATH-612 Project: Commons Math Issue Type: Improvement Affects Versions: 3.0, Nightly Builds Reporter: Christopher Nix Priority: Minor Fix For: 3.0, Nightly Builds QRDecomposition, BiDiagonalTransformer and TriDiagonalTransformer all contain methods that create an empty matrix for population in calculations employing getEntry and setEntry on the matrix. Methods getEntry and setEntry perform a check to ensure the matrix indices are in bounds. This check is not necessary if the calling method has already ensured them to be in bounds, for example by checking the max and min index as a looping parameter. Methods within QRDecomposition, BiDiagonalTransformer and TriDiagonalTransformer have significantly improved performance over large matrices if, instead of creating an empty RealMatrix and then using getEntry and setEntry, we create a double array for direct access and create a RealMatrix from it at the end. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
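The pattern described in this issue can be illustrated with a small self-contained Java sketch. CheckedMatrix below is a hypothetical stand-in for a matrix whose getEntry and setEntry validate indices on every call; it is not the Commons Math RealMatrix, and the attached patches are not reproduced here. The sketch only shows the shape of the change: fill a plain double[][] inside loops whose bounds already guarantee valid indices, then wrap the array once at the end. No timing claim is made by the sketch itself.

```java
/** Hypothetical stand-in for a matrix whose accessors check bounds on every call. */
final class CheckedMatrix {
    private final double[][] data;

    CheckedMatrix(int rows, int cols) {
        data = new double[rows][cols];
    }

    /** Wraps an existing array without copying it. */
    CheckedMatrix(double[][] d) {
        data = d;
    }

    double getEntry(int r, int c) {
        check(r, c);
        return data[r][c];
    }

    void setEntry(int r, int c, double v) {
        check(r, c);
        data[r][c] = v;
    }

    private void check(int r, int c) {
        if (r < 0 || r >= data.length || c < 0 || c >= data[0].length) {
            throw new IndexOutOfBoundsException("(" + r + ", " + c + ")");
        }
    }
}

final class DirectFillSketch {

    /** Builds an n x n upper-triangular matrix through the checked accessors. */
    static CheckedMatrix viaAccessors(int n) {
        CheckedMatrix m = new CheckedMatrix(n, n);
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                m.setEntry(i, j, i + j);       // bounds re-checked on every call
            }
        }
        return m;
    }

    /** Builds the same matrix by filling a raw array, wrapping it once at the end. */
    static CheckedMatrix viaArray(int n) {
        double[][] a = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                a[i][j] = i + j;               // loop bounds already guarantee valid indices
            }
        }
        return new CheckedMatrix(a);
    }

    public static void main(String[] args) {
        System.out.println(viaAccessors(3).getEntry(0, 2) == viaArray(3).getEntry(0, 2));
    }
}
```

In Commons Math itself the final wrapping step would presumably go through something like MatrixUtils.createRealMatrix(double[][]); whether that call copies the array is worth checking against the Javadoc of the version in use.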
[jira] [Closed] (DIGESTER-132) Add a CompoundSubstitutor to support more than one Substitutors at a time
[ https://issues.apache.org/jira/browse/DIGESTER-132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simone Tripodi closed DIGESTER-132. --- included in Apache Commons Digester 3.0 release > Add a CompoundSubstitutor to support more than one Substitutors at a time > - > > Key: DIGESTER-132 > URL: https://issues.apache.org/jira/browse/DIGESTER-132 > Project: Commons Digester > Issue Type: Improvement >Reporter: Tobias Demuth >Assignee: Simone Tripodi >Priority: Trivial > Fix For: 3.0 > > Attachments: CompoundSubstitutor.java, CompoundSubstitutorTest.java > > > At the moment only one Substitutor at a time is allowed to be set. If > different classes configure the same Digester - for example due to > subclassing - the set Substitutor may be overridden accidentally. This can be > easily avoided by using a CompoundSubstitutor which simply chains two > Substitutors together - any input will be first handled by Substitutor A and > then by Substitutor B. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
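The chaining described in this issue can be sketched as follows. This is not the attached CompoundSubstitutor.java: the BodyTextSubstitutor interface is a hypothetical, simplified stand-in that handles body text only (the real Digester Substitutor also rewrites attributes), and the class names are invented for the illustration. The point is simply that the output of A becomes the input of B, so the order of the two matters.

```java
/** Hypothetical, simplified stand-in for a substitutor that rewrites body text. */
interface BodyTextSubstitutor {
    String substitute(String bodyText);
}

/** Chains two substitutors: input goes through the first, then the second. */
final class CompoundBodyTextSubstitutor implements BodyTextSubstitutor {
    private final BodyTextSubstitutor first;
    private final BodyTextSubstitutor second;

    CompoundBodyTextSubstitutor(BodyTextSubstitutor first, BodyTextSubstitutor second) {
        if (first == null || second == null) {
            throw new IllegalArgumentException("both substitutors must be non-null");
        }
        this.first = first;
        this.second = second;
    }

    @Override
    public String substitute(String bodyText) {
        // A's result is handed to B, as described in the issue.
        return second.substitute(first.substitute(bodyText));
    }
}

final class CompoundSubstitutorDemo {
    public static void main(String[] args) {
        BodyTextSubstitutor a = text -> text.replace("${user}", "${fallback}");
        BodyTextSubstitutor b = text -> text.replace("${fallback}", "nobody");
        System.out.println(new CompoundBodyTextSubstitutor(a, b).substitute("hello ${user}"));
    }
}
```

Because a compound is itself a substitutor, more than two can be combined by nesting, which is one way two independent configurers of the same Digester could each add their own without overwriting the other.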
[jira] [Closed] (DIGESTER-105) [digester] Need to process [attribute id="name"]somename[/attribute]
[ https://issues.apache.org/jira/browse/DIGESTER-105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simone Tripodi closed DIGESTER-105. --- included in Apache Commons Digester 3.0 release > [digester] Need to process [attribute id="name"]somename[/attribute] > > > Key: DIGESTER-105 > URL: https://issues.apache.org/jira/browse/DIGESTER-105 > Project: Commons Digester > Issue Type: Improvement >Affects Versions: 1.6 > Environment: Operating System: other > Platform: Other >Reporter: Simon Kitching >Assignee: Simone Tripodi >Priority: Minor > Fix For: 3.0 > > Attachments: code.zip, code.zip, code.zip > > > It is reasonably common to encounter xml like > > somename > 99 > .. > > Currently there is no built-in rule to support this in Digester: > BeanPropertySetterRule supports somename > SetPropertyRule supports . > SetPropertiesRule supports > but nothing supports the first syntax listed above. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Closed] (DIGESTER-90) [digester] xmlrules does not support setNamespaceURI
[ https://issues.apache.org/jira/browse/DIGESTER-90?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simone Tripodi closed DIGESTER-90. -- included in Apache Commons Digester 3.0 release > [digester] xmlrules does not support setNamespaceURI > > > Key: DIGESTER-90 > URL: https://issues.apache.org/jira/browse/DIGESTER-90 > Project: Commons Digester > Issue Type: Improvement >Affects Versions: 1.6 > Environment: Operating System: All > Platform: All >Reporter: Simon Kitching >Assignee: Simone Tripodi >Priority: Minor > Fix For: 3.0 > > > It is not possible to set the namespace-uri associated with a Rule instance > created via xmlrules. This means that it is not possible to process a document > with namespaces using xmlrules. [well, it might be possible to set > namespaceAware to false, then include the prefix in the pattern, but that's a > nasty hack]. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Closed] (DIGESTER-133) Class fields are not set if class is inherited from HashMap if commons-beanutils-1.8.0 is used
[ https://issues.apache.org/jira/browse/DIGESTER-133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simone Tripodi closed DIGESTER-133. --- included in Apache Commons Digester 3.0 release > Class fields are not set if class is inherited from HashMap if > commons-beanutils-1.8.0 is used > -- > > Key: DIGESTER-133 > URL: https://issues.apache.org/jira/browse/DIGESTER-133 > Project: Commons Digester > Issue Type: Bug >Affects Versions: 1.6, 1.7, 1.8, 2.0 > Environment: OS: Kubuntu 8.0.4, Java version is 1.5.0_15 > Windows XP, Java version 1.5.0_11-b03 >Reporter: Alexander Kovalenko >Assignee: Simone Tripodi >Priority: Minor > Fix For: 3.0 > > > Class fields are not set if class is inherited from HashMap, value is put in > HashMap instead. > I tried this simple test with Digester 1.6, 1.7, 1.8 and 2.0. It works with > commons-beanutils-1.7.0, but does not work with commons-beanutils-1.8.0. > JUnit 4.4 was used for testing. > Class to be instantiated from XML == > import java.util.HashMap; > public class MyClass extends HashMap { > private boolean flag = false; > public boolean isFlag() > { > return flag; > } > public void setFlag(boolean flag) > { > this.flag = flag; > } > } > = Test === > import static org.junit.Assert.assertTrue; > import java.io.ByteArrayInputStream; > import java.io.IOException; > import org.apache.commons.digester.Digester; > import org.junit.Test; > import org.xml.sax.SAXException; > public class TestDigester { > > @Test > public void testDigester() throws IOException, SAXException { > final String xml = ""; > > final Digester digester = new Digester(); > digester.addObjectCreate("myclass", MyClass.class); > digester.addSetProperties("myclass"); > > final MyClass res = (MyClass) digester.parse(new > ByteArrayInputStream(xml.getBytes())); > assertTrue(res.isFlag()); > } > } -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Closed] (DIGESTER-85) [digester] Include filename or uri if Digester.parse(File file or String uri throws a SAXException
[ https://issues.apache.org/jira/browse/DIGESTER-85?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simone Tripodi closed DIGESTER-85. -- included in Apache Commons Digester 3.0 release > [digester] Include filename or uri if Digester.parse(File file or String uri > throws a SAXException > -- > > Key: DIGESTER-85 > URL: https://issues.apache.org/jira/browse/DIGESTER-85 > Project: Commons Digester > Issue Type: Improvement > Environment: Operating System: All > Platform: All >Reporter: Erik Meade >Assignee: Simone Tripodi >Priority: Minor > Fix For: 3.0 > > > Would make debugging easier. > A try/catch SAXException block around the getXMLReader().parse(input); > statements in parse(File) and parse(String) would allow the SAXException > to be caught, taken apart, an error statement > with file or uri added, and rethrown. > But how to capture the first stack trace? Use NestedExceptions from > jakarta-commons-lang? Use > org.apache.commons.lang.exception.ExceptionUtils.getStackTrace and cram > the first stack > trace into the new SAXException? Or avoid the new dependency on lang and > do something > else? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
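One way to do what this issue asks without a new dependency is sketched below, using only JDK classes. It is an assumption about how the wrapping could look, not the change that went into Digester 3.0: the class and method names are invented, and a bare SAX parser stands in for the Digester. SAXException has a (String, Exception) constructor, so the original exception, stack trace included, can ride along as the embedded exception while the message gains the file name.

```java
import java.io.File;
import java.io.IOException;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Sketch only: rethrow a SAXException with the source file added to the message. */
final class ParseWithSourceSketch {

    static void parse(File file) throws IOException, SAXException, ParserConfigurationException {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(file, new DefaultHandler());
        } catch (SAXException e) {
            // Keep the original as the embedded exception so its stack trace survives.
            throw new SAXException(
                    "Error parsing " + file.getAbsolutePath() + ": " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) throws Exception {
        try {
            parse(new File(args[0]));
            System.out.println("parsed without errors");
        } catch (SAXException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The caller can still reach the original through getException(), so nothing from the first stack trace is lost and no commons-lang helper is needed.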
[jira] [Closed] (DIGESTER-72) [digester] Allow SetNextRule to fire on begin
[ https://issues.apache.org/jira/browse/DIGESTER-72?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simone Tripodi closed DIGESTER-72. -- included in Apache Commons Digester 3.0 release > [digester] Allow SetNextRule to fire on begin > - > > Key: DIGESTER-72 > URL: https://issues.apache.org/jira/browse/DIGESTER-72 > Project: Commons Digester > Issue Type: Improvement >Affects Versions: 1.6 > Environment: Operating System: other > Platform: All >Reporter: Simon Kitching >Assignee: Simone Tripodi >Priority: Minor > Fix For: 3.0 > > > Currently, SetNextRule always invokes the target method from its end method. > But there is no reason why it can't invoke the target from begin. This would > be > useful in cases where it is desirable to build the parent/child relationship > before processing nested xml. In particular, using BeanPropertySetterRule > against nested xml elements can cause the setter methods to be called on a > bean > before its parent/child relationship is set up and sometimes this is bad. > It should be possible to add options to the constructor of SetNextRule to > indicate if fire-at-end (existing) or fire-at-begin (new) behaviour is > desired. > Of course the xmlrules module would need to be updated too. > And this feature probably could be applied to a few other rules. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Closed] (DIGESTER-142) Apache digester addSetProperty method is unclear and probably wrong.
[ https://issues.apache.org/jira/browse/DIGESTER-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simone Tripodi closed DIGESTER-142. --- included in Apache Commons Digester 3.0 release > Apache digester addSetProperty method is unclear and probably wrong. > > > Key: DIGESTER-142 > URL: https://issues.apache.org/jira/browse/DIGESTER-142 > Project: Commons Digester > Issue Type: Bug >Affects Versions: 2.1 > Environment: linux, 64 bits >Reporter: i30817 >Assignee: Simone Tripodi >Priority: Minor > Fix For: 3.0 > > > The addSetProperty method call: > Digester d = new Digester(); > d.push(this); > d.addObjectCreate("rdf:RDF/pgterms:etext", Book.class); > d.addSetProperty("rdf:RDF/pgterms:etext", "rdf:ID", "setId"); > OR > d.addSetProperty("rdf:RDF/pgterms:etext", "rdf:ID", "id"); > on a class Book with the method public void setId(String) > running on this xml (simplified and not tested the simplification): > > > &pg; > Peter's Mother > De La Pasture, Henry, Mrs., > 1866-1945 > Peter's Mother by Mrs. 
Henry > De La Pasture > > en > > 2003-12-01 > > > > gives this exception: > Nov 8, 2010 2:42:12 PM org.apache.commons.digester.Digester startElement > SEVERE: Begin event threw exception > java.lang.NoSuchMethodException: Bean has no property named etext10452 > at > org.apache.commons.digester.SetPropertyRule.begin(SetPropertyRule.java:154) > at org.apache.commons.digester.Rule.begin(Rule.java:177) > at > org.apache.commons.digester.Digester.startElement(Digester.java:1583) > at > com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(AbstractSAXParser.java:504) > at > com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.startElement(XMLDTDValidator.java:770) > at > com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(XMLDocumentFragmentScannerImpl.java:1340) > at > com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2732) > at > com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:625) > at > com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:488) > at > com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:812) > at > com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:741) > at > com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:123) > at > com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1208) > at > com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:525) > at org.apache.commons.digester.Digester.parse(Digester.java:1916) > ...(my code the rest). > To work around it use instead: > d.addSetProperties("rdf:RDF/pgterms:etext", "rdf:ID", "id"); > Doesn't make much sense to me. 
Also the confusion between a javadoc bean > property and a xml one is very, very misleading on the javadoc. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Closed] (DIGESTER-97) [digester] XML self references
[ https://issues.apache.org/jira/browse/DIGESTER-97?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simone Tripodi closed DIGESTER-97. -- included in Apache Commons Digester 3.0 release > [digester] XML self references > -- > > Key: DIGESTER-97 > URL: https://issues.apache.org/jira/browse/DIGESTER-97 > Project: Commons Digester > Issue Type: Improvement >Affects Versions: 1.5 > Environment: Operating System: All > Platform: All >Reporter: Michail Medvinsky >Assignee: Simone Tripodi >Priority: Minor > Fix For: 3.0 > > > It would be very advantageous for users to add a new processing instruction > that we can use to self reference XML or other files from within a source > that > will be bound. > For example > > ?> > > …… > > This way we can change target objects and runtime binding just by editing the > xml file with rules or editing separate rules. > The rules should be available via multiple sources: > ClassLoader > File > URL > Custom -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Closed] (DIGESTER-127) Allow DigesterLoader to accept an instance of a preconfigured Digester
[ https://issues.apache.org/jira/browse/DIGESTER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simone Tripodi closed DIGESTER-127. --- included in Apache Commons Digester 3.0 release > Allow DigesterLoader to accept an instance of a preconfigured Digester > -- > > Key: DIGESTER-127 > URL: https://issues.apache.org/jira/browse/DIGESTER-127 > Project: Commons Digester > Issue Type: Improvement >Affects Versions: 1.8 >Reporter: Guillaume Jeudy >Assignee: Simone Tripodi >Priority: Minor > Fix For: 3.0 > > > The problem lies in the usage of a Digester with an XSD schema combined with > loading rules from an XML file with DigesterLoader. > Digester.setSchema() is mentioned as unreliable in the javadoc, for this > reason it is preferable to create a new Digester(SAXParser) and call > SAXParser.setProperty("schemalocation_proprietary_property_name", > mySchemaLocUrl); > Unfortunately we cannot use this combination with DigesterLoader. Change > DigesterLoader to support this scenario.. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Closed] (DIGESTER-28) [digester] Default ClassLoader policy unusable in EAR archive
[ https://issues.apache.org/jira/browse/DIGESTER-28?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simone Tripodi closed DIGESTER-28. -- included in Apache Commons Digester 3.0 release > [digester] Default ClassLoader policy unusable in EAR archive > - > > Key: DIGESTER-28 > URL: https://issues.apache.org/jira/browse/DIGESTER-28 > Project: Commons Digester > Issue Type: Bug >Affects Versions: 1.6 > Environment: Operating System: other > Platform: Other >Reporter: Craig Miller >Assignee: Simone Tripodi > Fix For: 3.0 > > > When used in an EAR archive the Digester default classloading/resource loading > implementation makes many major frameworks unusable. For example, if I use > Struts/Tiles (uses digester) in Web App war files and use Digester from any > EJB > component library or in the EAR classloader space either the Tiles definitions > cannot be loaded or other classes cannot be found. This is because Digester > by > default sets useContextClassloader = false. Since most users and frameworks > (Struts, Tiles, JSF, etc) do not set useContextClassloader = true, Digester > essentially breaks enterprise applications where the Digester is used from > more > than one module. Note that end users do not control the uses of Digester, so the > default useContextClassloader policy should be true. > Patch by changing: > useContextClassloader = false > to: > useContextClassloader = true > // > This solves the problem, for which Google turns up endless hits. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Closed] (DIGESTER-143) Unintuitive, possibly broken behaviour.
[ https://issues.apache.org/jira/browse/DIGESTER-143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simone Tripodi closed DIGESTER-143. --- Assignee: Simone Tripodi included in Apache Commons Digester 3.0 release > Unintuitive, possibly broken behaviour. > --- > > Key: DIGESTER-143 > URL: https://issues.apache.org/jira/browse/DIGESTER-143 > Project: Commons Digester > Issue Type: Improvement >Affects Versions: 2.1 > Environment: Linux 64 bits >Reporter: i30817 >Assignee: Simone Tripodi >Priority: Minor > Fix For: 3.0 > > Attachments: Main.java > > > When one tag is a substring of another tag and both are used as xml patterns, > such as: > "rdf:RDF/pgterms:etext/dc:creator" > "rdf:RDF/pgterms:etext/dc:creator/rdf:Bag/rdf:li" (instances in a collection > of creators) > the callback to the first is called once with an empty string ("") as the body > of the tag if it encounters an instance of the second tag. > If additionally the first tag has an argument > "rdf:RDF/pgterms:etext/dc:creator rdf:type="Literal"" and you bind things > correctly for a two-argument callback with the text body and the argument > value ("Literal"), the callback of the first will be called with a null > rdf:type argument and an empty string as the body. You can use this to > distinguish the bogus callback as a workaround, but it requires an additional > bean method and confusing binding too. > The best thing would be if these empty-string callbacks were avoidable. I can > just test for an empty string in the callback; however, I'm also trying to > create assertions on the xml content (including no empty strings). > If the bogus callback must exist, one of the ways to make it obvious and > distinguishable would be to use null as the default value, instead of "". No > xml document will have the computation null value, unless some very strange > java binding is happening, in which case, you're asking for trouble. "" is > even worse, because it is common to both domains.
> So RFE: > 1) Avoid the bogus callback of the smaller xml tree branch if you can. > 2) if you can't, use null as a default value instead of the indistinguishable > "". > 3) if 2) document this behavior in the javadoc and the digester FAQ. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Closed] (DIGESTER-103) [digester] xmlrules does not support NodeCreateRule
[ https://issues.apache.org/jira/browse/DIGESTER-103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simone Tripodi closed DIGESTER-103. --- included in Apache Commons Digester 3.0 release > [digester] xmlrules does not support NodeCreateRule > --- > > Key: DIGESTER-103 > URL: https://issues.apache.org/jira/browse/DIGESTER-103 > Project: Commons Digester > Issue Type: Improvement >Affects Versions: 1.6 > Environment: Operating System: All > Platform: All >Reporter: Simon Kitching >Assignee: Simone Tripodi >Priority: Minor > Fix For: 3.0 > > > xmlrules does not support NodeCreateRule -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Closed] (DIGESTER-123) xmlrules dtd does not define xmlattrs for node-create-rule
[ https://issues.apache.org/jira/browse/DIGESTER-123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simone Tripodi closed DIGESTER-123. --- included in Apache Commons Digester 3.0 release > xmlrules dtd does not define xmlattrs for node-create-rule > -- > > Key: DIGESTER-123 > URL: https://issues.apache.org/jira/browse/DIGESTER-123 > Project: Commons Digester > Issue Type: Bug >Affects Versions: 1.8 >Reporter: Simon Kitching >Assignee: Simone Tripodi > Fix For: 3.0 > > > As reported by Jianguo Zhang, the file digester-rules.dtd has what appears to > be a copy-and-paste error, defining the xml attrs for object-param-rule a > second time, when it should be defining xml attrs for node-create-rule. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Closed] (DIGESTER-140) Change the getClass().getClassLoader() to Thread.currentThread().getContextClassLoader()
[ https://issues.apache.org/jira/browse/DIGESTER-140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simone Tripodi closed DIGESTER-140. --- included in Apache Commons Digester 3.0 release > Change the getClass().getClassLoader() to > Thread.currentThread().getContextClassLoader() > > > Key: DIGESTER-140 > URL: https://issues.apache.org/jira/browse/DIGESTER-140 > Project: Commons Digester > Issue Type: Bug >Affects Versions: 2.0 > Environment: Apache Tomcat 5.5.9 , >Reporter: ashwin kumar >Assignee: Simone Tripodi > Labels: ClassLoader, Problem > Fix For: 3.0 > > Original Estimate: 2m > Remaining Estimate: 2m > > When I use Digester from a Java agent, a few pieces of code that use > getClassLoader() via getClass().getClassLoader() return null. This happens because Digester is loaded by the system > classloader. > Fix: using the following code > "Thread.currentThread().getContextClassLoader()" should fix that > Locations to fix: > ClassName: org.apache.commons.digester.SetNextRule > Method: end() > Line Number: 204 > ClassName: org.apache.commons.digester.xmlrules.FromXmlRuleSet > Method: addRuleInstances() > Line Number: 164 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
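For readers following DIGESTER-140, here is a minimal sketch of the lookup order the reporter asks for. It is not the Digester source: `ClassLoaderFallback` and `resolve` are hypothetical names chosen for illustration. It relies only on the documented JDK behaviour that `Class.getClassLoader()` may return null for classes defined by the bootstrap loader, and it assumes that preferring the thread context class loader is the desired policy.

```java
final class ClassLoaderFallback {

    /**
     * Picks a loader for resolving classes on behalf of {@code owner}:
     * the thread context class loader when one is set, otherwise the loader
     * that defined {@code owner}, otherwise the system class loader.
     */
    static ClassLoader resolve(Class<?> owner) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // May itself be null when owner was defined by the bootstrap loader.
            loader = owner.getClassLoader();
        }
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader;
    }

    public static void main(String[] args) {
        // java.lang.String is defined by the bootstrap loader, so this prints null.
        System.out.println(String.class.getClassLoader());
        // The helper still yields a usable loader for the same class.
        System.out.println(resolve(String.class) != null);
    }
}
```

The fallback chain is a design choice for this sketch only; code that must stay inside one module may prefer the owner's loader first.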
[jira] [Closed] (DIGESTER-144) Add OSGi headers to jar files.
[ https://issues.apache.org/jira/browse/DIGESTER-144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simone Tripodi closed DIGESTER-144. --- included in Apache Commons Digester 3.0 release > Add OSGi headers to jar files. > -- > > Key: DIGESTER-144 > URL: https://issues.apache.org/jira/browse/DIGESTER-144 > Project: Commons Digester > Issue Type: Improvement >Reporter: Kirby Bohling > Fix For: 3.0 > > Attachments: osgi_headers.patch > > > I am using Apache Felix and attempting to use commons-digester. It would be > much easier if the official build include the OSGi Manifest headers. I have > modified the Maven POM to make the generated jar also be a valid OSGi bundle. > Patch to be attached. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Closed] (DIGESTER-131) [PATCH] commons-digester / Allow recursive match in ExtendedBaseRules.java (see patch)
[ https://issues.apache.org/jira/browse/DIGESTER-131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simone Tripodi closed DIGESTER-131. --- included in Apache Commons Digester 3.0 release > [PATCH] commons-digester / Allow recursive match in ExtendedBaseRules.java > (see patch) > -- > > Key: DIGESTER-131 > URL: https://issues.apache.org/jira/browse/DIGESTER-131 > Project: Commons Digester > Issue Type: Improvement > Environment: all >Reporter: Volker Karlmeier >Assignee: Simone Tripodi > Fix For: 3.0 > > Attachments: ExtendedBaseRules.patch > > Original Estimate: 10m > Remaining Estimate: 10m > > Recursive tags in XML-rules-file only work on root node. Nested nodes like > the one below do not work. > With the attached patch, it is possible to specify rules like > <...> > <...> > > classname="de.wsy.f4ja.alertbatches.configuration.alerting.alertingconfig.Properties" > /> > > > > > > > classname="de.wsy.f4ja.alertbatches.configuration.alerting.alertingconfig.Property" > /> > > > > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Closed] (DIGESTER-139) DigesterRuleParser - many of the nested classes could be static
[ https://issues.apache.org/jira/browse/DIGESTER-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simone Tripodi closed DIGESTER-139. --- included in Apache Commons Digester 3.0 release > DigesterRuleParser - many of the nested classes could be static > --- > > Key: DIGESTER-139 > URL: https://issues.apache.org/jira/browse/DIGESTER-139 > Project: Commons Digester > Issue Type: Improvement >Reporter: Sebb >Assignee: Simone Tripodi > Fix For: 3.0 > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Closed] (DIGESTER-134) Bug in SetPropertyRule
[ https://issues.apache.org/jira/browse/DIGESTER-134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simone Tripodi closed DIGESTER-134. --- included in Apache Commons Digester 3.0 release > Bug in SetPropertyRule > -- > > Key: DIGESTER-134 > URL: https://issues.apache.org/jira/browse/DIGESTER-134 > Project: Commons Digester > Issue Type: Bug >Affects Versions: 2.0 >Reporter: Alexander Krasnukhin >Assignee: Simone Tripodi > Fix For: 3.0 > > > In method begin(Attributes attributes) something seems wrong: > if (name.equals(this.name)) { > actualName = value; > } else if (name.equals(this.value)) { > actualValue = value; > } > But it actually must be fixed this way: > if (name.equals(this.name)) { > actualName = name; > actualValue = value; > break; > } > It won't work without that fix. I tested. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
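To make the two branches of the first snippet quoted in DIGESTER-134 easier to compare with the proposed change, here is a stand-alone sketch of what that loop computes when read literally. It is not the Digester source: `SetPropertySketch`, `resolve`, and the `key`/`val` attribute names are hypothetical, and a plain `Map` stands in for the SAX `Attributes` object.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SetPropertySketch {

    /**
     * Mirrors the first quoted loop. nameAttr and valueAttr are the names of the
     * two XML attributes that carry the property name and the property value.
     * Returns {propertyName, propertyValue}; either element may be null.
     */
    static String[] resolve(Map<String, String> attributes, String nameAttr, String valueAttr) {
        String actualName = null;
        String actualValue = null;
        for (Map.Entry<String, String> attribute : attributes.entrySet()) {
            String name = attribute.getKey();
            String value = attribute.getValue();
            if (name.equals(nameAttr)) {
                actualName = value;   // this attribute's value names the bean property
            } else if (name.equals(valueAttr)) {
                actualValue = value;  // this attribute's value is what gets assigned
            }
        }
        return new String[] {actualName, actualValue};
    }

    public static void main(String[] args) {
        // Hypothetical element: <set key="title" val="Emma"/>
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("key", "title");
        attributes.put("val", "Emma");
        String[] resolved = resolve(attributes, "key", "val");
        System.out.println(resolved[0] + " = " + resolved[1]); // prints "title = Emma"
    }
}
```

Read this way, the loop takes the property name from one attribute's value and the property value from another attribute's value, whereas the proposed change takes both from a single attribute. Which of the two is intended is for the issue discussion to settle.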
[jira] [Closed] (DIGESTER-137) Public/protected static fields which intended as constants, but which are not marked final
[ https://issues.apache.org/jira/browse/DIGESTER-137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simone Tripodi closed DIGESTER-137. --- included in Apache Commons Digester 3.0 release > Public/protected static fields which intended as constants, but which are not > marked final > -- > > Key: DIGESTER-137 > URL: https://issues.apache.org/jira/browse/DIGESTER-137 > Project: Commons Digester > Issue Type: Bug >Reporter: Sebb >Assignee: Simone Tripodi > Fix For: 3.0 > > Attachments: DIGESTER-137.patch > > > Some digester classes contain public/protected static fields which intended > as constants, but which are not marked final. > Such fields should be marked final to avoid malicious or accidental > corruption of the value. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Closed] (DIGESTER-118) ObjectCreateRule shouldn't keep className as a field
[ https://issues.apache.org/jira/browse/DIGESTER-118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simone Tripodi closed DIGESTER-118. --- included in Apache Commons Digester 3.0 release > ObjectCreateRule shouldn't keep className as a field > > > Key: DIGESTER-118 > URL: https://issues.apache.org/jira/browse/DIGESTER-118 > Project: Commons Digester > Issue Type: Bug >Affects Versions: 1.7 >Reporter: Kohsuke Kawaguchi >Assignee: Simone Tripodi > Fix For: 3.0 > > > Currently ObjectCreateRule refers to the class by using the name, but this is > highly undesirable. > 1. "begin" uses the classloader that loaded Digester to resolve this class > name, but this won't work in multi-classloader environment (like IDE, Maven, > etc.) > 2. "begin" invokes the loadClass method each time a new object is created. > This is unnecessary performance hit. > The proper thing to do is to retain the Class object, and convert String to > Class in the constructor. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
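The suggestion in DIGESTER-118 (convert the String to a Class once, in the constructor, and retain the Class object) can be sketched roughly as follows. This is an illustration only, not ObjectCreateRule itself: `CreateBySavedClass` is a hypothetical name, and the caller is assumed to pass the class loader it wants used.

```java
final class CreateBySavedClass {

    private final Class<?> type;

    /** Resolves the class name exactly once, with the loader chosen by the caller. */
    CreateBySavedClass(String className, ClassLoader loader) {
        try {
            // false: do not run static initializers until an instance is needed
            this.type = Class.forName(className, false, loader);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("Cannot load " + className, e);
        }
    }

    /** Creates a new instance without another class lookup. */
    Object newInstance() {
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        CreateBySavedClass rule =
                new CreateBySavedClass("java.util.ArrayList", ClassLoader.getSystemClassLoader());
        Object first = rule.newInstance();
        Object second = rule.newInstance();
        System.out.println(first.getClass().getName()); // prints "java.util.ArrayList"
        System.out.println(first != second);            // prints "true"
    }
}
```

Failing in the constructor also surfaces a mistyped class name when the rule is configured rather than in the middle of a parse, which may or may not be the behaviour the project wants.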
[jira] [Commented] (MATH-611) A fast and stable SVD implementation from JAMA
[ https://issues.apache.org/jira/browse/MATH-611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060478#comment-13060478 ] Christopher Nix commented on MATH-611: -- With regard to speed, I've tried it on random arrays with up to 1,000,000 elements and it's definitely faster than the current SVD implementation. The current SVD implementation is noticeably slowed by the implementations of AbstractRealMatrix and TriDiagonalTransformer that use setEntry and getEntry on matrices within loops. As such, matrix indices are checked to be in bounds at every iteration of a loop, when what we really should do is check the max and min indices only, if at all. (I've a further patch yet to submit on this). If we remove the use of getEntry and setEntry in methods called by the current SVD implementation, and instead use direct array access, then the speed of the current implementation is improved significantly, however the stability remains unchanged. Even with this change, the JAMA code is about twice as fast to converge (according to my profiler) and it is demonstrably more accurate. Chris. > A fast and stable SVD implementation from JAMA > -- > > Key: MATH-611 > URL: https://issues.apache.org/jira/browse/MATH-611 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 3.0, Nightly Builds >Reporter: Christopher Nix > Labels: patch > Fix For: 3.0, Nightly Builds > > Attachments: SingularValueDecompositionImpl.java > > > Common numerical stability issues with the current SVD implementation, ie > MATH-327, MATH-383, MATH-465, MATH-583 can all be solved by co-opting JAMA > code that is within the public domain. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
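The point in the comment above about checking only the minimum and maximum indices, rather than every index inside a loop, can be illustrated with a small stand-alone sketch. It does not use the Commons Math classes: `EntryAccessSketch` and its methods are hypothetical, the backing array is assumed non-empty and rectangular, and no timing claim is made here.

```java
final class EntryAccessSketch {

    private final double[][] data; // assumed non-empty and rectangular

    EntryAccessSketch(double[][] data) {
        this.data = data;
    }

    private void checkIndex(int row, int column) {
        if (row < 0 || row >= data.length || column < 0 || column >= data[0].length) {
            throw new IndexOutOfBoundsException("(" + row + ", " + column + ")");
        }
    }

    /** Accessor that validates both indices on every call. */
    double getEntry(int row, int column) {
        checkIndex(row, column);
        return data[row][column];
    }

    /** Sums a block through the checked accessor: one range check per element. */
    double sumChecked(int startRow, int endRow, int startColumn, int endColumn) {
        double sum = 0.0;
        for (int i = startRow; i <= endRow; i++) {
            for (int j = startColumn; j <= endColumn; j++) {
                sum += getEntry(i, j);
            }
        }
        return sum;
    }

    /** Validates the two extreme corners once, then reads the array directly. */
    double sumDirect(int startRow, int endRow, int startColumn, int endColumn) {
        checkIndex(startRow, startColumn);
        checkIndex(endRow, endColumn);
        double sum = 0.0;
        for (int i = startRow; i <= endRow; i++) {
            final double[] rowData = data[i];
            for (int j = startColumn; j <= endColumn; j++) {
                sum += rowData[j];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        EntryAccessSketch m = new EntryAccessSketch(
                new double[][] {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}});
        System.out.println(m.sumChecked(0, 1, 1, 2)); // prints "16.0"
        System.out.println(m.sumDirect(0, 1, 1, 2));  // prints "16.0"
    }
}
```

Both methods add the same elements in the same order, so they return identical results; the difference is only in how often the explicit range check runs.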
[jira] [Commented] (MATH-577) Enhance Complex.java
[ https://issues.apache.org/jira/browse/MATH-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060477#comment-13060477 ] Gilles commented on MATH-577: - {quote} test for scalar rhs missing. {quote} Could you add them then? [If possible, one test method per test.] Thank you. > Enhance Complex.java > > > Key: MATH-577 > URL: https://issues.apache.org/jira/browse/MATH-577 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 3.0 >Reporter: Arne Plöse >Priority: Minor > Attachments: Complex.diff, Complex.diff > > > Add some double shorthand methods to Complex fix different NaN checks in add > and subtract ! Testcase testAddNaN will fail (what should be the result ?) > What is missing JavaDoc and testcases. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (SANDBOX-339) [Dot Export] Adding vertex attribute
[ https://issues.apache.org/jira/browse/SANDBOX-339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060459#comment-13060459 ] Marco Speranza commented on SANDBOX-339: Simo, I don't totally agree with you: commons-graph implements the export logic, and the user only has the flexibility to customize the attributes of the vertices, with a key-value map for each vertex. Furthermore, the user has to know these values even if we create an adapter. He has to learn the possible attribute keys and their acceptable values in any case. Anyway, thinking about it more, we can try to do this: * create a simple CSV file that contains all possible values, with type and minimum acceptable value for each key (we can export this directly from [this site|http://www.graphviz.org/content/attrs]) * create an adapter that checks the values any time the user adds a new value for a vertex * so we only have to update this CSV file in order to update the acceptable values. WDYT? Have a nice day Simo ;) > [Dot Export] Adding vertex attribute > > > Key: SANDBOX-339 > URL: https://issues.apache.org/jira/browse/SANDBOX-339 > Project: Commons Sandbox > Issue Type: Improvement > Components: Graph >Reporter: Marco Speranza >Priority: Minor > Attachments: DotExport-AddingVertexAttribute.patch > > > Hi folk, I just made a little improvement to Dot Export. I added the > possibility to customize the attributes of the vertices through a > {{Map}} > Looking forward your comments. > bye -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
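The CSV-plus-adapter idea proposed in the comment above could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `key,type,minimum` column layout, the `VertexAttributeChecker` class, and the three sample rows, which are illustrative and not an export of the Graphviz attribute table. It is not part of commons-graph.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class VertexAttributeChecker {

    /** One row of the hypothetical CSV: key,type,minimum (minimum may be empty). */
    private static final class Spec {
        final String type;
        final Double minimum;

        Spec(String type, Double minimum) {
            this.type = type;
            this.minimum = minimum;
        }
    }

    private final Map<String, Spec> specs = new HashMap<>();
    private final Map<String, String> attributes = new LinkedHashMap<>();

    /** Loads the accepted keys from CSV text; updating the CSV updates the rules. */
    VertexAttributeChecker(String csv) {
        for (String line : csv.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] cells = trimmed.split(",", -1);
            Double minimum = cells.length > 2 && !cells[2].isEmpty()
                    ? Double.valueOf(cells[2]) : null;
            specs.put(cells[0], new Spec(cells[1], minimum));
        }
    }

    /** Validates the value against the loaded rules before storing it. */
    void put(String key, String value) {
        Spec spec = specs.get(key);
        if (spec == null) {
            throw new IllegalArgumentException("Unknown attribute: " + key);
        }
        if (spec.type.equals("int") || spec.type.equals("double")) {
            double number;
            try {
                number = spec.type.equals("int")
                        ? Integer.parseInt(value) : Double.parseDouble(value);
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException(key + " expects " + spec.type + ": " + value, e);
            }
            if (spec.minimum != null && number < spec.minimum) {
                throw new IllegalArgumentException(key + " must be >= " + spec.minimum);
            }
        }
        attributes.put(key, value);
    }

    Map<String, String> attributes() {
        return Collections.unmodifiableMap(attributes);
    }

    public static void main(String[] args) {
        String csv = "fontsize,double,1.0\nperipheries,int,0\nlabel,string,\n";
        VertexAttributeChecker checker = new VertexAttributeChecker(csv);
        checker.put("fontsize", "12");
        checker.put("label", "start");
        System.out.println(checker.attributes()); // prints "{fontsize=12, label=start}"
    }
}
```

A checker along these lines keeps the string-keyed flexibility Marco argues for while rejecting values that the loaded rules rule out, which is the concern Simone raises; whether a type plus a minimum is enough to describe every DOT attribute is left open here.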
[jira] [Commented] (SANDBOX-339) [Dot Export] Adding vertex attribute
[ https://issues.apache.org/jira/browse/SANDBOX-339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060446#comment-13060446 ] Simone Tripodi commented on SANDBOX-339: {{Properties}} is *too* flexible. It's true that it allows the DOT exporter to adapt to future DOT versions, but at the same time {{DotExport}} loses all its power when it produces corrupted output. If users have to check the syntactic/semantic correctness of each attribute by *themselves*... why use the {{commons-graph}} dot exporter? That way users are forced to learn the language, while the purpose should be to produce the {{*.dot}} files transparently and open them with a renderer provided by a 3rd party. Otherwise, with just a little more work users can write their own smarter version :) I'd prefer automatically producing a working subset of the DOT language rather than being so flexible that we produce potentially corrupted output. A different strategy is needed, let's think about it > [Dot Export] Adding vertex attribute > > > Key: SANDBOX-339 > URL: https://issues.apache.org/jira/browse/SANDBOX-339 > Project: Commons Sandbox > Issue Type: Improvement > Components: Graph >Reporter: Marco Speranza >Priority: Minor > Attachments: DotExport-AddingVertexAttribute.patch > > > Hi folk, I just made a little improvement to Dot Export. I added the > possibility to customize the attributes of the vertices through a > {{Map}} > Looking forward your comments. > bye -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MATH-611) A fast and stable SVD implementation from JAMA
[ https://issues.apache.org/jira/browse/MATH-611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060433#comment-13060433 ] Mikkel Meyer Andersen commented on MATH-611: Thanks a lot for adding this. I think that we should document in the comments to the code that it is based on JAMA. We could maybe also contact JAMA and inform about our considerations of adopting/using the code. I can see that the last version is from 2005, but bugs might still be reported, and it would be great to benefit from this. And users of JAMA might also benefit from future improvements made by us. Have you made any comparison with the existing in regards to speed? Cheers, Mikkel. > A fast and stable SVD implementation from JAMA > -- > > Key: MATH-611 > URL: https://issues.apache.org/jira/browse/MATH-611 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 3.0, Nightly Builds >Reporter: Christopher Nix > Labels: patch > Fix For: 3.0, Nightly Builds > > Attachments: SingularValueDecompositionImpl.java > > > Common numerical stability issues with the current SVD implementation, ie > MATH-327, MATH-383, MATH-465, MATH-583 can all be solved by co-opting JAMA > code that is within the public domain. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (MATH-611) A fast and stable SVD implementation from JAMA
[ https://issues.apache.org/jira/browse/MATH-611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060433#comment-13060433 ] Mikkel Meyer Andersen edited comment on MATH-611 at 7/6/11 9:24 AM: Thanks a lot for adding this. I think that we should document in the comments to the code that it is based on JAMA. We could maybe also contact JAMA and inform about our considerations of adopting/using the code. I can see that the last version is from 2005, but bugs might still be reported, and it would be great to benefit from this. And users of JAMA might also benefit from future improvements made by us. What do you think about this? Have you made any comparison with the existing in regards to speed? Cheers, Mikkel. was (Author: mikl): Thanks a lot for adding this. I think that we should document in the comments to the code that it is based on JAMA. We could maybe also contact JAMA and inform about our considerations of adopting/using the code. I can see that the last version is from 2005, but bugs might still be reported, and it would be great to benefit from this. And users of JAMA might also benefit from future improvements made by us. Have you made any comparison with the existing in regards to speed? Cheers, Mikkel. > A fast and stable SVD implementation from JAMA > -- > > Key: MATH-611 > URL: https://issues.apache.org/jira/browse/MATH-611 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 3.0, Nightly Builds >Reporter: Christopher Nix > Labels: patch > Fix For: 3.0, Nightly Builds > > Attachments: SingularValueDecompositionImpl.java > > > Common numerical stability issues with the current SVD implementation, ie > MATH-327, MATH-383, MATH-465, MATH-583 can all be solved by co-opting JAMA > code that is within the public domain. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (MATH-611) A fast and stable SVD implementation from JAMA
[ https://issues.apache.org/jira/browse/MATH-611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060405#comment-13060405 ] Christopher Nix edited comment on MATH-611 at 7/6/11 8:34 AM: -- Attached SingularValueDecomposition.java containing JAMA code wrapped into Commons Math API. was (Author: joubert): JAMA code wrapped into Commons Math API. > A fast and stable SVD implementation from JAMA > -- > > Key: MATH-611 > URL: https://issues.apache.org/jira/browse/MATH-611 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 3.0, Nightly Builds >Reporter: Christopher Nix > Labels: patch > Fix For: 3.0, Nightly Builds > > Attachments: SingularValueDecompositionImpl.java > > > Common numerical stability issues with the current SVD implementation, ie > MATH-327, MATH-383, MATH-465, MATH-583 can all be solved by co-opting JAMA > code that is within the public domain. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MATH-611) A fast and stable SVD implementation from JAMA
[ https://issues.apache.org/jira/browse/MATH-611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christopher Nix updated MATH-611: - Attachment: SingularValueDecompositionImpl.java JAMA code wrapped into Commons Math API. > A fast and stable SVD implementation from JAMA > -- > > Key: MATH-611 > URL: https://issues.apache.org/jira/browse/MATH-611 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 3.0, Nightly Builds >Reporter: Christopher Nix > Labels: patch > Fix For: 3.0, Nightly Builds > > Attachments: SingularValueDecompositionImpl.java > > > Common numerical stability issues with the current SVD implementation, ie > MATH-327, MATH-383, MATH-465, MATH-583 can all be solved by co-opting JAMA > code that is within the public domain. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MATH-611) A fast and stable SVD implementation from JAMA
A fast and stable SVD implementation from JAMA -- Key: MATH-611 URL: https://issues.apache.org/jira/browse/MATH-611 Project: Commons Math Issue Type: Improvement Affects Versions: 3.0, Nightly Builds Reporter: Christopher Nix Fix For: 3.0, Nightly Builds Common numerical stability issues with the current SVD implementation, ie MATH-327, MATH-383, MATH-465, MATH-583 can all be solved by co-opting JAMA code that is within the public domain. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (SANDBOX-339) [Dot Export] Adding vertex attribute
[ https://issues.apache.org/jira/browse/SANDBOX-339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060400#comment-13060400 ] Marco Speranza commented on SANDBOX-339: Hi Simo, I chose the {{Properties}} type precisely for that reason: it's very flexible. The DOT language isn't that small (see [this|http://www.graphviz.org/content/attrs]), so I think that creating a POJO or an adapter would be very painful (think only of the semantic checks for all of these fields), and furthermore we would have to update the parser/adapter every time the DOT language changes. IMHO the user should check the syntactic/semantic correctness of each attribute himself; with flexible APIs he has no constraints and can mix any attribute values. WDYT? ciao > [Dot Export] Adding vertex attribute > > > Key: SANDBOX-339 > URL: https://issues.apache.org/jira/browse/SANDBOX-339 > Project: Commons Sandbox > Issue Type: Improvement > Components: Graph >Reporter: Marco Speranza >Priority: Minor > Attachments: DotExport-AddingVertexAttribute.patch > > > Hi folk, I just made a little improvement to Dot Export. I added the > possibility to customize the attributes of the vertices through a > {{Map}} > Looking forward your comments. > bye -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MATH-577) Enhance Complex.java
[ https://issues.apache.org/jira/browse/MATH-577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arne Plöse updated MATH-577: Attachment: Complex.diff Some enhancements for scalar rhs. With javadocs and fixed? test case of testAddNaN(Complex) replaced call to isNaN() with field access isNaN. test for scalar rhs missing. > Enhance Complex.java > > > Key: MATH-577 > URL: https://issues.apache.org/jira/browse/MATH-577 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 3.0 >Reporter: Arne Plöse >Priority: Minor > Attachments: Complex.diff, Complex.diff > > > Add some double shorthand methods to Complex fix different NaN checks in add > and subtract ! Testcase testAddNaN will fail (what should be the result ?) > What is missing JavaDoc and testcases. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira