[
https://issues.apache.org/jira/browse/PHOENIX-66?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930548#comment-13930548
]
James Taylor commented on PHOENIX-66:
-------------------------------------
bq. The current impl of StringToArrayConverter intentionally doesn't make use
of Connection#createArrayOf to allow it to run without a Connection object,
Use our "connectionless" connection for this. Derive your test from
BaseConnectionlessQueryTest, and we'll create a connection like this:
DriverManager.getConnection("jdbc:phoenix:none;test=true"), which doesn't
require a connection to the mini cluster. All of the metadata operations (e.g.
CREATE/DROP table) still work, as will all of the client-side JDBC stuff; it's
just queries that won't work. It's a good way to write negative tests as well.
bq. The lack of a check for the array delimiter being the same as the field
delimiter is somewhat intentional.
How about we do the following when the field and array delimiters are the
same: 1) when we encounter an array, we parse a single field as the array (or
apply some other default behavior if that's easier to implement), and 2) log a
warning documenting the behavior in (1).
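A minimal sketch of the fallback in (1) and (2). The class and method names here are illustrative only, not actual Phoenix code:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.logging.Logger;
import java.util.regex.Pattern;

// Hypothetical sketch; names are illustrative, not actual Phoenix code.
public class ArrayDelimiterFallback {
    private static final Logger LOG =
            Logger.getLogger(ArrayDelimiterFallback.class.getName());
    private static boolean warned = false;

    // When the array delimiter collides with the field delimiter, fall back
    // to a single-element array and log a warning (once) documenting it.
    public static List<String> parseArrayField(String rawField,
            char fieldDelim, char arrayDelim) {
        if (fieldDelim == arrayDelim) {
            if (!warned) {
                LOG.warning("Array delimiter '" + arrayDelim
                        + "' matches field delimiter; parsing a single-element array");
                warned = true;
            }
            return Collections.singletonList(rawField);
        }
        // Normal case: split the raw field on the array delimiter.
        return Arrays.asList(
                rawField.split(Pattern.quote(String.valueOf(arrayDelim)), -1));
    }
}
```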
No worries on the formatting/compiler settings, as we don't document these
well. We use 4 spaces instead of 2, and the two compiler settings that may
differ from your defaults are: flagging a missing @Override annotation as a
warning, and flagging unused local variables and unused private members as a
warning and an error, respectively.
> Support array creation from CSV file
> ------------------------------------
>
> Key: PHOENIX-66
> URL: https://issues.apache.org/jira/browse/PHOENIX-66
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
> Fix For: 3.0.0
>
> Attachments: PHOENIX-66-intermediate.patch, PHOENIX-66.patch
>
>
> We should support being able to parse an array defined in our CSV file.
> Perhaps something like this:
> a, b, c, [foo, 1, bar], d
> We'd know (from the data type of the column), that we have an array for the
> fourth field here.
> One option to support this would be to implement the
> PDataType.toObject(String) for the ARRAY PDataType enums. That's not ideal,
> though, as we'd introduce a dependency from PDataType to our CSVLoader, since
> we'd in turn need to parse each element. Also, we don't have a way to pass
> through the custom delimiters that might be in use.
> Another pretty trivial, though a bit more constrained, approach would be to
> look at the column's ARRAY_SIZE to control how many of the next CSV fields
> should be used as array elements. In this approach, you wouldn't use the
> square brackets at all. You can get the ARRAY_SIZE from the column metadata
> via a connection.getMetaData().getColumns() call, followed by
> resultSet.getInt("ARRAY_SIZE"). However, the ARRAY_SIZE is optional in a DDL
> statement, so we'd need to do something for the case where it's not specified.
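The ARRAY_SIZE-driven approach could be sketched roughly like this, with hypothetical names; arraySize would come from the metadata call described above:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the ARRAY_SIZE-driven approach; names are
// illustrative, not actual Phoenix code.
public class ArraySizeCsvParser {
    // Consume arraySize fields starting at offset as the elements of one
    // array column. The caller would obtain arraySize from
    // connection.getMetaData().getColumns() / resultSet.getInt("ARRAY_SIZE").
    public static List<String> consumeArray(String[] csvFields, int offset,
            int arraySize) {
        if (offset + arraySize > csvFields.length) {
            throw new IllegalArgumentException(
                    "CSV row too short for declared ARRAY_SIZE " + arraySize);
        }
        return Arrays.asList(
                Arrays.copyOfRange(csvFields, offset, offset + arraySize));
    }
}
```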
> A third option would be to handle most of the parsing in the CSVLoader. We
> could use the above bracket syntax, and then collect up the next set of CSV
> fields until we hit an unescaped ']'. Then we'd use our standard
> JDBC APIs to build the array and continue on our merry way.
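A rough sketch of that third option, with hypothetical names. Only the field-collection step is shown; building the java.sql.Array via Connection#createArrayOf would be left to the caller:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the bracket-syntax approach; names are
// illustrative, not actual Phoenix code.
public class BracketArrayCollector {
    // Given the CSV fields of one row, expand a field that starts with '['
    // by collecting subsequent fields until one ends with an unescaped ']'.
    // Returns the collected array elements with the brackets stripped.
    public static List<String> collectArray(String[] fields, int start) {
        List<String> elements = new ArrayList<>();
        for (int i = start; i < fields.length; i++) {
            String f = fields[i].trim();
            if (i == start) {
                f = f.substring(1); // drop the leading '['
            }
            boolean closes = f.endsWith("]") && !f.endsWith("\\]");
            if (closes) {
                elements.add(f.substring(0, f.length() - 1));
                return elements;
            }
            elements.add(f);
        }
        throw new IllegalArgumentException(
                "Unterminated array literal starting at field " + start);
    }
}
```

For the row from the issue description, a, b, c, [foo, 1, bar], d, calling collectArray with start = 3 would yield the three elements foo, 1, and bar.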
> What do you think, [~jviolettedsiq]? Or [~bruno], maybe you can take a crack
> at it?
--
This message was sent by Atlassian JIRA
(v6.2#6252)