[
https://issues.apache.org/jira/browse/HIVE-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276587#comment-13276587
]
Phabricator commented on HIVE-1719:
-----------------------------------
cwsteinbach has requested changes to the revision "HIVE-1719 [jira] Move
RegexSerDe out of hive-contrib and over to hive-serde".
INLINE COMMENTS
serde/src/java/org/apache/hadoop/hive/serde2/RegexSerDe.java:45 Please remove
"It can also serialize the row object using a format string."
serde/src/java/org/apache/hadoop/hive/serde2/RegexSerDe.java:42 Please change
to "RegexSerDe uses a regular expression to deserialize row data. It does not
support data serialization."
serde/src/java/org/apache/hadoop/hive/serde2/RegexSerDe.java:93 Please remove
the outputFormatString variable and just do:
if (null != tbl.getProperty("output.format.string")) {
LOG.warn(....);
}
serde/src/java/org/apache/hadoop/hive/serde2/RegexSerDe.java:102 Throw the
exception here instead of in the next IF block.
serde/src/java/org/apache/hadoop/hive/serde2/RegexSerDe.java:186 if
(!alreadyLoggedNoMatch) {
...
}
serde/src/java/org/apache/hadoop/hive/serde2/RegexSerDe.java:158 Adding
"Count" to the end of this variable name would help to make it clear that this
is a scalar variable as opposed to a collection type.
serde/src/java/org/apache/hadoop/hive/serde2/RegexSerDe.java:184 Does this
logic do anything? It looks like the code will log the first unmatched row it
finds, set alreadyLoggedNoMatch = true, and subsequently never log another
warning. Is there any reason to call getNextNumberToDisplay()?
serde/src/java/org/apache/hadoop/hive/serde2/RegexSerDe.java:205 I think it's
fine to log one warning for the first row that has partial matches. No need to
get fancy with this exponential backoff logging strategy.
REVISION DETAIL
https://reviews.facebook.net/D3249
BRANCH
HIVE-1719-trunk
To: JIRA, cwsteinbach, shreepadma
> Move RegexSerDe out of hive-contrib and over to hive-serde
> ----------------------------------------------------------
>
> Key: HIVE-1719
> URL: https://issues.apache.org/jira/browse/HIVE-1719
> Project: Hive
> Issue Type: Task
> Components: Serializers/Deserializers
> Reporter: Carl Steinbach
> Assignee: Shreepadma Venugopalan
> Attachments: HIVE-1719.3.patch.txt, HIVE-1719.D3051.1.patch,
> HIVE-1719.D3051.2.patch, HIVE-1719.D3141.1.patch, HIVE-1719.D3249.1.patch,
> HIVE-1719.D3249.1.patch, HIVE-1719.patch
>
>
> RegexSerDe is as much a part of the standard Hive distribution as the other
> SerDes
> currently in hive-serde. I think we should move it over to the hive-serde
> module so that
> users don't have to go to the added effort of manually registering the
> contrib jar before
> using it.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira