[
https://issues.apache.org/jira/browse/PHOENIX-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13959824#comment-13959824
]
ravi commented on PHOENIX-898:
------------------------------
[~jviolettedsiq], I noticed that two test cases fail with the code in the patch.
1) STORE A into 'hbase://MYSCHEMA.MYTABLE' using
org.apache.phoenix.pig.PhoenixHBaseStorage('localhost','-batchSize 5000');
This fails with a StringIndexOutOfBoundsException while trying to
strip the leading path token '/', which is absent in this case because
the location has no column list after the table name.
{code}
// strip off the leading path token '/'
String columns = locationURI.getPath().substring(1);
{code}
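One possible guard, as a rough sketch only: check the path before calling substring(1) and treat a missing path as "no columns specified". The class and method names here (StoreLocationSketch, parseColumns) are hypothetical and are not part of the patch or of Phoenix.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the patch or of Phoenix.
final class StoreLocationSketch {

    // Returns the upsert columns listed after the table name in the location,
    // or an empty list when the location carries no column list.
    static List<String> parseColumns(String location) {
        URI locationURI = URI.create(location);
        String path = locationURI.getPath();
        List<String> columns = new ArrayList<>();
        // getPath() is "" for 'hbase://MYSCHEMA.MYTABLE', so guard before
        // stripping the leading '/'.
        if (path == null || path.length() <= 1) {
            return columns;
        }
        for (String column : path.substring(1).split(",")) {
            String trimmed = column.trim();
            if (!trimmed.isEmpty()) {
                columns.add(trimmed);
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        System.out.println(parseColumns("hbase://MYSCHEMA.MYTABLE"));
        System.out.println(parseColumns("hbase://MYSCHEMA.MYTABLE/NAME,D.INFO,D.I1"));
    }
}
```

With a guard like this, an empty result could mean "upsert all columns", which would keep the existing STORE syntax working.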
2) The constructed UPSERT query is invalid for a table with the default
column family. Excerpt below.
{code}
List<ColumnInfo> columnMetadataList =
PhoenixRuntime.generateColumnInfo(connection, tableName, null);
...
...
String upsertStmt = QueryUtil.constructUpsertStatement(tableName,
columnMetadataList);
{code}
For a table, say HIRES, whose DDL is
{code}
String ddl = "CREATE TABLE HIRES (id integer not null, name varchar, "
    + "location varchar constraint pk primary key(id))";
{code}
the generated UPSERT is
{code}
UPSERT INTO HIRES (ID, 0.NAME, 0.LOCATION) VALUES (?, ?, ?)
{code}
Using the above UPSERT statement for inserts fails with the following
exception.
{code}
org.apache.phoenix.exception.PhoenixParserException: ERROR 601 (42P00):
Syntax error. Encountered "." at line 1, column 12.
{code}
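To illustrate the kind of statement I would expect instead, here is a minimal sketch that drops the "0." prefix before building the UPSERT. It assumes "0" is the default column family name, as the generated statement above suggests, and works on plain column-name strings rather than ColumnInfo. UpsertSketch and constructUpsert are hypothetical names, not the QueryUtil API, and this is not necessarily how the patch should fix it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not the QueryUtil implementation.
final class UpsertSketch {

    // Assumption: "0" is the default column family name, so "0.NAME" refers
    // to the column NAME in the default family.
    static final String DEFAULT_FAMILY_PREFIX = "0.";

    static String constructUpsert(String tableName, List<String> columnNames) {
        List<String> names = new ArrayList<>();
        for (String column : columnNames) {
            // Drop the default-family prefix; keep explicit families such as "D.INFO".
            names.add(column.startsWith(DEFAULT_FAMILY_PREFIX)
                    ? column.substring(DEFAULT_FAMILY_PREFIX.length())
                    : column);
        }
        // One bind parameter per column.
        String placeholders = String.join(", ", Collections.nCopies(names.size(), "?"));
        return "UPSERT INTO " + tableName
                + " (" + String.join(", ", names) + ") VALUES (" + placeholders + ")";
    }

    public static void main(String[] args) {
        System.out.println(constructUpsert("HIRES", Arrays.asList("ID", "0.NAME", "0.LOCATION")));
    }
}
```

Columns with an explicit family, such as D.INFO in MYSCHEMA.MYTABLE, pass through unchanged in this sketch.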
Can you please test at your end to see if you notice the same?
> Extend PhoenixHBaseStorage to specify upsert columns
> ----------------------------------------------------
>
> Key: PHOENIX-898
> URL: https://issues.apache.org/jira/browse/PHOENIX-898
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 3.0.0
> Reporter: James Violette
> Fix For: 3.0.0
>
> Attachments: PHOENIX_898_1.patch
>
>
> We have a Phoenix table with data from multiple sources. We would like to
> write a pig script that upserts only data associated with a feed, leaving
> other data alone. The current PhoenixHBaseStorage automatically upserts all
> columns in a table.
> Given this table schema as an example,
> create TABLE IF NOT EXISTS MYSCHEMA.MYTABLE
> (NAME varchar not null
> ,D.INFO VARCHAR
> ,D.D1 DOUBLE
> ,D.I1 INTEGER
> ,D.C1 VARCHAR
> CONSTRAINT pk PRIMARY KEY (NAME));
> Assuming 'A' is loaded into pig,
> The current syntax loads all columns into MYSCHEMA.MYTABLE:
> STORE A into 'hbase://MYSCHEMA.MYTABLE' using
> org.apache.phoenix.pig.PhoenixHBaseStorage('localhost','-batchSize 5000');
> We could specify upsert columns after the table in the hbase:// url.
> This column-based example is equivalent to the full table upsert.
> STORE A into 'hbase://MYSCHEMA.MYTABLE/NAME,D.INFO,D.D1,D.I1,D.C1' using
> org.apache.phoenix.pig.PhoenixHBaseStorage('localhost','-batchSize 5000');
> This column-based example chooses to load only three of the five columns.
> STORE A into 'hbase://MYSCHEMA.MYTABLE/NAME,D.INFO,D.I1' using
> org.apache.phoenix.pig.PhoenixHBaseStorage('localhost','-batchSize 5000');
> This change would touch
> PhoenixHBaseStorage.setStoreLocation - parse the columns
> PhoenixPigConfiguration.configure - add an optional column list parameter.
> PhoenixPigConfiguration.setup - create the upsert statement and create the
> column metadata list
> The rest of the code should work as-is.