[
http://jira.codehaus.org/browse/QDOX-82?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=178873#action_178873
]
Mark Jenner commented on QDOX-82:
---------------------------------
Hi,
I had the same issue parsing something else with StreamTokenizer, and I found
your issue while searching for solutions. I could not find one, so I cooked up
my own and thought you might be interested in applying it to your problem as
well. Basically, I needed to parse strings contained in double quotes.
StreamTokenizer does this for you, but it ends the string at the first newline,
so multi-line values get cut off. So instead of letting StreamTokenizer do the
string parsing, I tell it that the double quote is not special, and when I get
to one, I reconfigure the tokenizer into my own "string mode" where the only
special chars are double quote and backslash. When I get to the end of the
string, I switch back to my normal tokenizer config (for a format called fvar).
Here are the methods:
// UNDER_SCORE, SPACE, DOUBLE_QUOTE and ESCAPE are int constants for
// '_', ' ', '"' and '\' (their definitions are not shown here).
private void setUpTokenizerForFvar(StreamTokenizer tokenizer) {
    // Set up the tokenizer just like a new one, as per the
    // StreamTokenizer constructor comment
    tokenizer.resetSyntax();
    tokenizer.wordChars((int)'a', (int)'z');
    tokenizer.wordChars((int)'A', (int)'Z');
    tokenizer.wordChars(128 + 32, 255);
    tokenizer.whitespaceChars(0, (int)' ');
    tokenizer.commentChar((int)'/');
    tokenizer.parseNumbers();
    // Attribute names in fvar can include underscores, and spaces!
    tokenizer.wordChars(UNDER_SCORE, UNDER_SCORE);
    tokenizer.wordChars(SPACE, SPACE);
    // Unlike a new tokenizer, the double quote is an ordinary char here,
    // so that we can handle quoted strings ourselves
    tokenizer.ordinaryChar(DOUBLE_QUOTE);
}
private void setUpTokenizerForQuotedValue(StreamTokenizer tokenizer) {
    // Reset the tokenizer to treat everything as a word except the
    // double quote char and the escape char
    tokenizer.resetSyntax();
    // 0-255 rather than 0-127, so that non-ASCII chars in the string are
    // part of the word too instead of coming back as ordinary chars
    tokenizer.wordChars(0, 255);
    tokenizer.ordinaryChar(ESCAPE);
    tokenizer.ordinaryChar(DOUBLE_QUOTE);
}
// Because StreamTokenizer does not parse quoted strings that contain
// newlines properly, we have to do it ourselves. Reads everything up to
// the matching closing quote, ignoring any quote that is preceded by an
// escape char '\'. Also stops at end of input, so that an unterminated
// string cannot loop forever.
private String parseQuotedString(int openQuote, StreamTokenizer tokenizer)
        throws IOException {
    StringBuilder value = new StringBuilder();
    setUpTokenizerForQuotedValue(tokenizer);
    int nextToken = tokenizer.nextToken();
    boolean escapedQuote = false;
    while (nextToken != StreamTokenizer.TT_EOF
            && (escapedQuote || nextToken != openQuote)) {
        escapedQuote = false;
        if (nextToken == StreamTokenizer.TT_WORD) {
            value.append(tokenizer.sval);
        } else if (nextToken == ESCAPE) {
            // Keep the backslash in the value and remember that the next
            // quote, if any, does not close the string
            escapedQuote = true;
            value.append((char)nextToken);
        } else if (nextToken == openQuote) {
            // Only reached for an escaped quote
            value.append((char)nextToken);
        }
        nextToken = tokenizer.nextToken();
    }
    // Switch back to the normal config for the rest of the input
    setUpTokenizerForFvar(tokenizer);
    return value.toString();
}
It is used in some code like this:
    [...]
    nextToken = tokenizer.nextToken();
    if (nextToken == DOUBLE_QUOTE) {
        String value = parseQuotedString(nextToken, tokenizer);
    [...]
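If you want to try the idea in isolation, here is a minimal, self-contained
sketch in plain Java. The class name QuotedStringDemo and its method names are
made up for illustration; they are not part of QDox or of my fvar parser. The
"normal mode" is deliberately much simpler than the fvar config above, and the
escape handling is slightly stricter (a doubled backslash does not escape the
quote that follows it), so treat it as one possible variation rather than the
exact code I run.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;

// Hypothetical demo class, not part of QDox.
class QuotedStringDemo {
    static final int DOUBLE_QUOTE = '"';
    static final int ESCAPE = '\\';

    // Simplified "normal mode": letters are words, control chars and space
    // are whitespace, and the double quote is an ordinary char.
    static void normalMode(StreamTokenizer t) {
        t.resetSyntax();
        t.wordChars('a', 'z');
        t.wordChars('A', 'Z');
        t.whitespaceChars(0, ' ');
        t.ordinaryChar(DOUBLE_QUOTE);
    }

    // "String mode": everything is a word char except quote and backslash,
    // so newlines end up inside the word token.
    static void stringMode(StreamTokenizer t) {
        t.resetSyntax();
        t.wordChars(0, 255);
        t.ordinaryChar(ESCAPE);
        t.ordinaryChar(DOUBLE_QUOTE);
    }

    // Call right after the opening quote has been read in normal mode.
    // Backslashes are kept in the returned value.
    static String readQuoted(StreamTokenizer t) throws IOException {
        StringBuilder value = new StringBuilder();
        stringMode(t);
        boolean escaped = false;
        int tok = t.nextToken();
        while (tok != StreamTokenizer.TT_EOF
                && (escaped || tok != DOUBLE_QUOTE)) {
            if (tok == StreamTokenizer.TT_WORD) {
                value.append(t.sval);
                escaped = false;
            } else {
                // Either a backslash or an escaped quote
                value.append((char) tok);
                escaped = (tok == ESCAPE) && !escaped;
            }
            tok = t.nextToken();
        }
        normalMode(t);
        return value.toString();
    }

    // Returns the first double-quoted value in the source, or null.
    static String firstQuotedValue(String source) throws IOException {
        StreamTokenizer t = new StreamTokenizer(new StringReader(source));
        normalMode(t);
        int tok;
        while ((tok = t.nextToken()) != StreamTokenizer.TT_EOF) {
            if (tok == DOUBLE_QUOTE) {
                return readQuoted(t);
            }
        }
        return null;
    }

    // For comparison: what the built-in quote handling returns for a
    // source that starts with a double quote.
    static String defaultQuoteHandling(String source) throws IOException {
        StreamTokenizer t = new StreamTokenizer(new StringReader(source));
        t.nextToken();
        return t.sval;
    }

    public static void main(String[] args) throws IOException {
        String quoted = "\"this is\nmultilined\"";
        System.out.println("built-in:    " + defaultQuoteHandling(quoted));
        System.out.println("string mode: " + firstQuotedValue("foo=" + quoted));
    }
}
```

With the built-in quote handling the value stops at the newline, while the
string mode version returns both lines, newline included. What to do with that
newline (and with a leading " * " on javadoc continuation lines) is left to
the caller.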
Hope that is of some value to you.
> multiline'd tag attribute values not working anymore
> ----------------------------------------------------
>
> Key: QDOX-82
> URL: http://jira.codehaus.org/browse/QDOX-82
> Project: QDox
> Issue Type: Bug
> Components: Parser
> Reporter: Grégory Joseph (old account)
> Fix For: 1.10
>
> Attachments: MultineLineAttributeValuesWithQDoxTestCase.java,
> qdox82-test.patch
>
>
> Some undefined time ago, it was possible to parse the following source
> with qdox and retrieve a sensible value for the "foo" attribute of the
> "bar.baz" tag -> "this is multilined"
> /**
> * @bar.baz foo="this is
> * multilined"
> */
> with the latest snapshot, this unfortunately doesn't work anymore.
> I haven't found an open related jira issue, but before creating one, I
> wanted to make sure this wasn't on purpose ?
> I think allowing this makes sense, for instance for xdoclet because
> some attributes might have longish values, like the "description"
> elements for servlets and such.