[ https://issues.apache.org/jira/browse/PHOENIX-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963520#comment-13963520 ]
Prashant Kommireddi commented on PHOENIX-898:
---------------------------------------------
[~elilevine] there are more bugs with this patch than just the one I pointed
out on PHOENIX-907. Can you please revert this patch and commit PHOENIX-907
first? This will ensure this patch does not break anything.
[~jviolettedsiq] have you tested your patch against a query that does not
specify any upsert columns? It fails for two reasons:
# See PHOENIX-907. You should not expect upsert columns to always be present.
# columnMetadataList depends on columns being set in conf. This will never
happen if the columns are not specified as part of the query.
Can you please fix the code, test this patch against PHOENIX-907 and make sure
nothing breaks?
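
To make the request concrete, here is a minimal sketch of the handling I have in mind, where the column list in the store location is optional. The class and method names (UpsertLocation, parse, effectiveColumns, buildUpsert) are hypothetical and are not taken from the patch or from PhoenixPigConfiguration. The list of all table columns is assumed to come from the table metadata, which is not shown here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for illustration only; not part of the PHOENIX-898 patch. */
final class UpsertLocation {
    private static final String SCHEME = "hbase://";

    final String tableName;
    final List<String> columns; // empty when the location names no columns

    private UpsertLocation(String tableName, List<String> columns) {
        this.tableName = tableName;
        this.columns = Collections.unmodifiableList(columns);
    }

    /** Parses 'hbase://TABLE' or 'hbase://TABLE/COL1,COL2,...'. */
    static UpsertLocation parse(String location) {
        if (location == null || !location.startsWith(SCHEME)) {
            throw new IllegalArgumentException("Expected hbase://TABLE[/COLUMNS]: " + location);
        }
        String rest = location.substring(SCHEME.length());
        int slash = rest.indexOf('/');
        String table = (slash < 0 ? rest : rest.substring(0, slash)).trim();
        if (table.isEmpty()) {
            throw new IllegalArgumentException("Missing table name: " + location);
        }
        List<String> cols = new ArrayList<>();
        if (slash >= 0) {
            // The column list is optional; blank entries are skipped.
            for (String c : rest.substring(slash + 1).split(",")) {
                String trimmed = c.trim();
                if (!trimmed.isEmpty()) {
                    cols.add(trimmed);
                }
            }
        }
        return new UpsertLocation(table, cols);
    }

    /** Falls back to every table column when no upsert columns were given. */
    List<String> effectiveColumns(List<String> allTableColumns) {
        return columns.isEmpty() ? allTableColumns : columns;
    }

    /** Builds a parameterized UPSERT statement for the given columns. */
    static String buildUpsert(String table, List<String> cols) {
        if (cols.isEmpty()) {
            throw new IllegalArgumentException("At least one column is required");
        }
        StringBuilder names = new StringBuilder();
        StringBuilder binds = new StringBuilder();
        for (int i = 0; i < cols.size(); i++) {
            if (i > 0) {
                names.append(", ");
                binds.append(", ");
            }
            names.append(cols.get(i));
            binds.append('?');
        }
        return "UPSERT INTO " + table + " (" + names + ") VALUES (" + binds + ")";
    }
}
```

With this sketch, parsing 'hbase://MYSCHEMA.MYTABLE/NAME,D.INFO,D.I1' and passing the result to buildUpsert gives "UPSERT INTO MYSCHEMA.MYTABLE (NAME, D.INFO, D.I1) VALUES (?, ?, ?)", while 'hbase://MYSCHEMA.MYTABLE' yields an empty column list and falls back to whatever the table metadata reports. The point is that neither the upsert statement nor the column metadata list should depend on columns having been set in conf.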
> Extend PhoenixHBaseStorage to specify upsert columns
> ----------------------------------------------------
>
> Key: PHOENIX-898
> URL: https://issues.apache.org/jira/browse/PHOENIX-898
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 3.0.0
> Reporter: James Violette
> Fix For: 3.0.0
>
> Attachments: PHOENIX_898_1.patch
>
>
> We have a Phoenix table with data from multiple sources. We would like to
> write a Pig script that upserts only data associated with a feed, leaving
> other data alone. The current PhoenixHBaseStorage automatically upserts all
> columns in a table.
> Given this table schema as an example,
> create TABLE IF NOT EXISTS MYSCHEMA.MYTABLE
> (NAME varchar not null
> ,D.INFO VARCHAR
> ,D.D1 DOUBLE
> ,D.I1 INTEGER
> ,D.C1 VARCHAR
> CONSTRAINT pk PRIMARY KEY (NAME));
> Assuming 'A' is loaded into Pig, the current syntax loads all columns into
> MYSCHEMA.MYTABLE:
> STORE A into 'hbase://MYSCHEMA.MYTABLE' using
> org.apache.phoenix.pig.PhoenixHBaseStorage('localhost','-batchSize 5000');
> We could specify upsert columns after the table in the hbase:// url.
> This column-based example is equivalent to the full table upsert.
> STORE A into 'hbase://MYSCHEMA.MYTABLE/NAME,D.INFO,D.D1,D.I1,D.C1' using
> org.apache.phoenix.pig.PhoenixHBaseStorage('localhost','-batchSize 5000');
> This column-based example chooses to load only three of the five columns.
> STORE A into 'hbase://MYSCHEMA.MYTABLE/NAME,D.INFO,D.I1' using
> org.apache.phoenix.pig.PhoenixHBaseStorage('localhost','-batchSize 5000');
> This change would touch:
> PhoenixHBaseStorage.setStoreLocation - parse the columns
> PhoenixPigConfiguration.configure - add an optional column list parameter
> PhoenixPigConfiguration.setup - create the upsert statement and create the
> column metadata list
> The rest of the code should work as-is.