[jira] [Commented] (CSV-196) Store the information of raw data read by lexer
[ https://issues.apache.org/jira/browse/CSV-196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16156340#comment-16156340 ]

Matt Sun commented on CSV-196:
------------------------------

This is exactly what I'm proposing. We should track both, with the byte position being optional. That is completely backward compatible.

> Store the information of raw data read by lexer
> -----------------------------------------------
>
> Key: CSV-196
> URL: https://issues.apache.org/jira/browse/CSV-196
> Project: Commons CSV
> Issue Type: Improvement
> Components: Parser
> Affects Versions: 1.4
> Reporter: Matt Sun
> Labels: patch
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> It would be good to have the CSVParser class store whether a field was enclosed by quotes in the original source file.
> For example, for this data sample:
> A, B, C
> a1, "b1", c1
> CSVParser gives us the record a1, b1, c1, which is helpful because it parsed the double quotes, but we also lose the information about the original data. We can't tell from the returned CSVRecord whether the original data was enclosed by double quotes or not.
> In our use case, we are integrating Apache Hadoop APIs with Commons CSV. CSV is one kind of input for Hadoop jobs, which should support splitting input data. To accurately split a CSV file into pieces, we need to count the bytes of data CSVParser actually read. CSVParser doesn't have accurate information about whether a field was enclosed by quotes, nor does it store the raw data of the original source. Downstream users of the Commons CSVParser are not able to get that information.
> To suggest a fix: extend the token/CSVRecord to have a boolean field indicating whether the column was enclosed by quotes. While the Lexer is doing getNextToken, set the flag if a field is encapsulated and successfully parsed.
> I found another issue reporting a similar request, but it was marked as resolved: [CSV91]
> https://issues.apache.org/jira/browse/CSV-91?jql=project%20%3D%20CSV%20AND%20text%20~%20%22with%20quotes%22

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
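To make the suggested fix concrete, here is a minimal sketch of a tokenizer that records, per field, whether it was enclosed by quotes. It is not the Commons CSV Lexer: the class QuotedFlagSketch, the Field type and parseLine are hypothetical names used only for illustration, and the sketch handles a single record with comma delimiters and doubled quotes as the escape.

```java
import java.util.ArrayList;
import java.util.List;

final class QuotedFlagSketch {
    /** Hypothetical token: the parsed value plus whether it was enclosed by quotes. */
    static final class Field {
        final String value;
        final boolean quoted;

        Field(String value, boolean quoted) {
            this.value = value;
            this.quoted = quoted;
        }
    }

    /** Splits one record on commas; inside "..." a doubled quote stands for one quote. */
    static List<Field> parseLine(String line) {
        List<Field> fields = new ArrayList<>();
        StringBuilder value = new StringBuilder();
        boolean inQuotes = false; // currently inside an open quoted field
        boolean quoted = false;   // the current field started with a quote
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"' && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    value.append('"'); // escaped quote
                    i++;
                } else if (c == '"') {
                    inQuotes = false;  // closing quote
                } else {
                    value.append(c);
                }
            } else if (c == '"' && value.length() == 0) {
                inQuotes = true;
                quoted = true;         // set the flag when the field is encapsulated
            } else if (c == ',') {
                fields.add(new Field(value.toString(), quoted));
                value.setLength(0);
                quoted = false;
            } else {
                value.append(c);
            }
        }
        fields.add(new Field(value.toString(), quoted));
        return fields;
    }

    public static void main(String[] args) {
        for (Field f : parseLine("a1,\"b1\",c1")) {
            System.out.println(f.value + " quoted=" + f.quoted);
        }
    }
}
```

The parsed values are the same as before (a1, b1, c1); the extra boolean only preserves what the parser already knows at the moment it consumes the quotes.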
[jira] [Commented] (TEXT-101) Move RandomStringUtils to Commons-text
[ https://issues.apache.org/jira/browse/TEXT-101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16156327#comment-16156327 ]

ASF GitHub Bot commented on TEXT-101:
-------------------------------------

Github user coveralls commented on the issue:

    https://github.com/apache/commons-text/pull/62

[![Coverage Status](https://coveralls.io/builds/13157503/badge)](https://coveralls.io/builds/13157503)

Coverage decreased (-0.03%) to 98.259% when pulling **00011e92477ca610a313c40ed749512e5443ffa2 on ameyjadiye:TEXT-101** into **30a29e2cf20e4c2036b423399a91cd01cd1759b6 on apache:master**.

> Move RandomStringUtils to Commons-text
> --------------------------------------
>
> Key: TEXT-101
> URL: https://issues.apache.org/jira/browse/TEXT-101
> Project: Commons Text
> Issue Type: Task
> Reporter: Amey Jadiye
> Fix For: 1.2
>
> This task involves more than just moving the class:
> 1. Keep RandomStringUtils in commons-lang, marked as deprecated.
> 2. Move the RandomStringUtils class to commons-text.
> 3. Remove all existing code from the methods of RandomStringUtils and have them call our brand new RandomStringGenerator to return the respective values, i.e. randomNumeric, randomAlphabetic, randomAlphanumeric, etc.
[jira] [Commented] (CSV-196) Store the information of raw data read by lexer
[ https://issues.apache.org/jira/browse/CSV-196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16156194#comment-16156194 ]

Gary Gregory commented on CSV-196:
----------------------------------

A character is different than a byte, so maybe we need to track both the character position and the byte position. Some folks might rely on the current behavior... Thoughts?
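As a small illustration of why the two positions diverge, the sketch below counts the same consumed text once in Java chars and once in UTF-8 bytes. It uses only the JDK; PositionDemo and its two methods are hypothetical helpers, not part of Commons CSV, and UTF-8 is an assumed encoding for the input file.

```java
import java.nio.charset.StandardCharsets;

final class PositionDemo {
    /** Number of Java chars consumed, which is what a character position counts. */
    static long charPosition(String consumed) {
        return consumed.length();
    }

    /** Number of bytes the same text occupies when encoded as UTF-8. */
    static long bytePosition(String consumed) {
        return consumed.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String ascii = "a1,b1,c1\n";
        String accented = "\u00e91,b1,c1\n"; // first field starts with e-acute
        System.out.println(charPosition(ascii) + " chars, " + bytePosition(ascii) + " bytes");
        System.out.println(charPosition(accented) + " chars, " + bytePosition(accented) + " bytes");
    }
}
```

For pure ASCII input the two numbers agree; as soon as one character needs more than one byte, a split offset computed from the character position no longer matches the offset in the file.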
[jira] [Comment Edited] (CSV-196) Store the information of raw data read by lexer
[ https://issues.apache.org/jira/browse/CSV-196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16156165#comment-16156165 ]

Matt Sun edited comment on CSV-196 at 9/6/17 10:58 PM:
-------------------------------------------------------

I'm reopening this issue because I found that getCharacterPosition doesn't serve the purpose when the characters are multiple bytes. I will submit a pull request on Github to suggest a fix.

was (Author: mattsun):
I'm reopening this issue because I found that getCharacterPosition doesn't serve the position when the characters are multiple bytes. I will submit a pull request on Github to suggest a fix.
[jira] [Reopened] (CSV-196) Store the information of raw data read by lexer
[ https://issues.apache.org/jira/browse/CSV-196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt Sun reopened CSV-196:
--------------------------

I'm reopening this issue because I found that getCharacterPosition doesn't serve the purpose when the characters are multiple bytes. I will submit a pull request on Github to suggest a fix.
[jira] [Commented] (LANG-1345) Add methods to enhance non-empty strings
[ https://issues.apache.org/jira/browse/LANG-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16156153#comment-16156153 ]

Sebb commented on LANG-1345:
----------------------------

Sorry, but I think this is straying far beyond the remit of the LANG library. It's probably also not appropriate for the TEXT library. IMO this sort of code belongs in the application.

> Add methods to enhance non-empty strings
> ----------------------------------------
>
> Key: LANG-1345
> URL: https://issues.apache.org/jira/browse/LANG-1345
> Project: Commons Lang
> Issue Type: Improvement
> Components: lang.*
> Reporter: Mitth'raw'nuruodo
> Priority: Minor
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> As a corollary to {{defaultString}}, {{appendIfMissing}}, and similar methods, which allow the client to replace a missing string with something, it would sometimes be helpful to have methods that enhance a string, if it is present, with a prefix and suffix (or perhaps with an arbitrary function).
> Eg
> {code}
> if (StringUtils.isNotEmpty(queryString)) {
>     return url + "?" + queryString;
> } else {
>     return url;
> }
> {code}
> would become:
> {code}
> return url + StringUtils.addIfNotEmpty(queryString, "?", null);
> {code}
> with the second parameter being the prefix, and the third, the suffix. If the string is null or empty, return empty string.
> If this could accept arbitrary transformation functions, it could effectively become the Elvis operator.
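For readers who want to try the proposed behavior in application code, as the comment suggests, a minimal sketch could look like this. addIfNotEmpty here is a local helper written from the description in the issue; it is not an existing StringUtils method, and the class name AddIfNotEmptySketch and the helper withQuery are made up for the example.

```java
final class AddIfNotEmptySketch {
    /**
     * Returns prefix + str + suffix when str is non-empty, else the empty string.
     * A null prefix or suffix is treated as empty.
     */
    static String addIfNotEmpty(String str, String prefix, String suffix) {
        if (str == null || str.isEmpty()) {
            return "";
        }
        return (prefix == null ? "" : prefix) + str + (suffix == null ? "" : suffix);
    }

    /** The query-string example from the issue, rewritten with the helper. */
    static String withQuery(String url, String queryString) {
        return url + addIfNotEmpty(queryString, "?", null);
    }

    public static void main(String[] args) {
        System.out.println(withQuery("http://example.org/a", "x=1"));
        System.out.println(withQuery("http://example.org/a", null));
    }
}
```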
[jira] [Updated] (VFS-643) VFS should not log at the INFO level
[ https://issues.apache.org/jira/browse/VFS-643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Gregory updated VFS-643:
----------------------------
    Description:
Due to the low-level nature of this library, we should not log at the INFO level.
There is only one event logged at the INFO level ATM. This changes the level to DEBUG:
{noformat}
vfs.impl/temp-dir.debug=Using "{0}" as temporary files store.
{noformat}

  was:
Due to the low-level nature of this library, we should not log at the INFO level.
There is only one event logged at the INFO level ATM. This changes the level to DEBUG.

> VFS should not log at the INFO level
> ------------------------------------
>
> Key: VFS-643
> URL: https://issues.apache.org/jira/browse/VFS-643
> Project: Commons VFS
> Issue Type: Improvement
> Affects Versions: 2.0, 2.1
> Reporter: Gary Gregory
> Assignee: Gary Gregory
> Fix For: 2.2
>
> Due to the low-level nature of this library, we should not log at the INFO level.
> There is only one event logged at the INFO level ATM. This changes the level to DEBUG:
> {noformat}
> vfs.impl/temp-dir.debug=Using "{0}" as temporary files store.
> {noformat}
[jira] [Closed] (VFS-643) VFS should not log at the INFO level
[ https://issues.apache.org/jira/browse/VFS-643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Gregory closed VFS-643.
----------------------------
    Resolution: Fixed
    Fix Version/s: 2.2

In SVN trunk.
[jira] [Created] (VFS-643) VFS should not log at the INFO level
Gary Gregory created VFS-643:
-----------------------------

Summary: VFS should not log at the INFO level
Key: VFS-643
URL: https://issues.apache.org/jira/browse/VFS-643
Project: Commons VFS
Issue Type: Improvement
Affects Versions: 2.1, 2.0
Reporter: Gary Gregory
Assignee: Gary Gregory

Due to the low-level nature of this library, we should not log at the INFO level.
There is only one event logged at the INFO level ATM. This changes the level to DEBUG.
[jira] [Commented] (TEXT-101) Move RandomStringUtils to Commons-text
[ https://issues.apache.org/jira/browse/TEXT-101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16156017#comment-16156017 ]

ASF GitHub Bot commented on TEXT-101:
-------------------------------------

GitHub user ameyjadiye opened a pull request:

    https://github.com/apache/commons-text/pull/62

    TEXT-101: Moved RandomStringUtils to Commons-text

    Moved **RandomStringUtils** from lang to text, with the core logic changed to use **RandomStringGenerator** internally. Test cases are copied from commons-lang to make sure it works the same way it does in commons-lang.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ameyjadiye/commons-text TEXT-101

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/commons-text/pull/62.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #62

commit aebacbabaa3debae909081cbd55475f6f108d141
Author: Amey Jadiye
Date: 2017-09-06T20:40:11Z

    copied RandomStringGenerator required code from master

commit e45738f00417ac0dcae1784c7904dc6fc156ae32
Author: Amey Jadiye
Date: 2017-09-06T20:41:51Z

    TEXT-101 : Moved RandomStringUtils from lang to text with improved core logic

commit 00011e92477ca610a313c40ed749512e5443ffa2
Author: Amey Jadiye
Date: 2017-09-06T20:43:33Z

    added Amey's email id
[jira] [Commented] (JEXL-239) Add NPE check to property for MapGetExecutor
[ https://issues.apache.org/jira/browse/JEXL-239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16155587#comment-16155587 ]

Henri Biestro commented on JEXL-239:
------------------------------------

Backporting JEXL3 checks / fix:
src/main/java/org/apache/commons/jexl2/internal/BooleanGetExecutor.java
src/main/java/org/apache/commons/jexl2/internal/DuckGetExecutor.java
src/main/java/org/apache/commons/jexl2/internal/DuckSetExecutor.java
src/main/java/org/apache/commons/jexl2/internal/MapGetExecutor.java
src/main/java/org/apache/commons/jexl2/internal/MapSetExecutor.java
Committed revision 1807495.

> Add NPE check to property for MapGetExecutor
> --------------------------------------------
>
> Key: JEXL-239
> URL: https://issues.apache.org/jira/browse/JEXL-239
> Project: Commons JEXL
> Issue Type: Improvement
> Affects Versions: 2.1.1
> Reporter: Bruno P. Kinoshita
> Priority: Minor
>
> From GitHub pull request #2 https://github.com/apache/commons-jexl/pull/2
> {quote}
> When the MapGetExecutor is initialized with a null key, the property of MapGetExecutor will be null.
> If the MapGetExecutor is cached and the key is changed (like map[index]), an NPE will be thrown.
> I think the intention of the condition is to compare class compatibility, so I add the NPE check here.
> {quote}
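The failure mode described in the quote can be sketched as follows. This is not the JEXL source: CachedMapGetterSketch, isCompatible, tryGet and the TRY_FAILED sentinel are hypothetical names, and the sketch only shows the general shape of a null-safe class-compatibility check on a cached key.

```java
import java.util.HashMap;
import java.util.Map;

final class CachedMapGetterSketch {
    /** Hypothetical sentinel returned when the cached getter cannot be reused. */
    static final Object TRY_FAILED = new Object();

    /** Key seen when this getter was created; null if it was created with a null key. */
    private final Object property;

    CachedMapGetterSketch(Object key) {
        this.property = key;
    }

    /** True when a later key has the same class as the key this getter was built for. */
    boolean isCompatible(Object key) {
        if (property == null || key == null) {
            // Guard: never call getClass() on a null reference.
            return property == null && key == null;
        }
        return property.getClass().equals(key.getClass());
    }

    /** Reuses the cached getter when the key is compatible, else reports failure. */
    Object tryGet(Map<?, ?> map, Object key) {
        return isCompatible(key) ? map.get(key) : TRY_FAILED;
    }

    public static void main(String[] args) {
        Map<Object, Object> map = new HashMap<>();
        map.put(1, "one");
        // Created with a null key, later reused with an Integer key as in map[index].
        CachedMapGetterSketch cached = new CachedMapGetterSketch(null);
        System.out.println(cached.tryGet(map, 1) == TRY_FAILED);
    }
}
```

Without the guard, the comparison would dereference the null property as soon as a non-null key arrives; with it, the cached getter simply declines and the caller can resolve a fresh one.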
[jira] [Commented] (JEXL-239) Add NPE check to property for MapGetExecutor
[ https://issues.apache.org/jira/browse/JEXL-239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16155547#comment-16155547 ]

Henri Biestro commented on JEXL-239:
------------------------------------

JEXL-221 is the JEXL3 instance of this (fixed) bug
[jira] [Updated] (JEXL-238) Restrict getLiteralClass to a Number for NumberLiterals
[ https://issues.apache.org/jira/browse/JEXL-238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated JEXL-238:
------------------------------
    Affects Version/s: 3.1

> Restrict getLiteralClass to a Number for NumberLiterals
> -------------------------------------------------------
>
> Key: JEXL-238
> URL: https://issues.apache.org/jira/browse/JEXL-238
> Project: Commons JEXL
> Issue Type: Improvement
> Affects Versions: 3.1
> Reporter: Cameron Samak
> Priority: Trivial
> Attachments: patch0.patch
>
> getLiteralClass in ASTNumberLiteral returns Class<?> when it could return Class<? extends Number>.
> I'd like to extend this to JexlArithmetic.narrowNumber, but that's a (trivial) breaking change so I left it out.
[jira] [Commented] (JEXL-239) Add NPE check to property for MapGetExecutor
[ https://issues.apache.org/jira/browse/JEXL-239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16155531#comment-16155531 ]

Henri Biestro commented on JEXL-239:
------------------------------------

This occurs in jexl2 and is fixed in jexl3.
[jira] [Updated] (JEXL-239) Add NPE check to property for MapGetExecutor
[ https://issues.apache.org/jira/browse/JEXL-239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated JEXL-239:
------------------------------
    Affects Version/s: 2.1.1