I am trying to get the syntax for a load statement in pig to query a string
value. But get a pig parsing error. If I query an integer it is successful. Is
it possible to query with a where clause against a string with Phoenix
pig\hbase integration? Below is the example and error
Example
0: jdbc:phoenix:zookeepernode1.l2lfbpukh1nufb> select * from EXAMPLE;
+------------------------------------------+----------------------------------------------------+----------------------------------------------------+
| MY_PK | FIRST_NAME
| LAST_NAME |
+------------------------------------------+----------------------------------------------------+----------------------------------------------------+
| 825241648 | James
| Dean |
| 825241649 | Susan
| Miller |
| 825241650 | Manish
| Sarni |
| 825241651 | Bev
| Adkins |
| 211295614005 | John
| Doe |
| 232854665520 | Mary
| Poppins |
+------------------------------------------+----------------------------------------------------+----------------------------------------------------+
# start up sqlline.py and create EXAMPLE table and upsert 1 row.
sqlline.py
==========
/usr/hdp/current/phoenix-client/bin$ python
/usr/hdp/current/phoenix-client/bin/sqlline.py
zookeepernode1.l2lfbpukh1nufbvgr2lbypaclg.bx.internal.cloudapp.net:2181:/hbase-unsecure
CREATE TABLE example (my_pk bigint not null, m.first_name varchar(50),
m.last_name varchar(50) CONSTRAINT pk PRIMARY KEY (my_pk));
upsert into EXAMPLE values (1, 'John', 'Smith');
SELECT * from EXAMPLE;
# Start up pig (grunt shell)
# Register phoenix client jar
REGISTER /usr/hdp/2.2.7.1-10/phoenix/phoenix-client.jar
# load a CSV file of contacts
raw = load '/data/contacts.txt' using PigStorage(',') as (my_pk, first_name,
last_name);
# Store the data into hbase using phoenix
STORE raw into 'hbase://EXAMPLE' using
org.apache.phoenix.pig.PhoenixHBaseStorage
('zookeepernode0.l2lfbpukh1nufbvgr2lbypaclg.bx.internal.cloudapp.net,zookeepernode1.l2lfbpukh1nufbvgr2lbypaclg.bx.internal.cloudapp.net,zookeepernode2.l2lfbpukh1nufbvgr2lbypaclg.bx.internal.cloudapp.net:2181:/hbase-unsecure','-batchSize
10-00');
# Load the table out of hbase into pig
T = LOAD 'hbase://table/EXAMPLE' using org.apache.phoenix.pig.PhoenixHBaseLoader
('zookeepernode0.l2lfbpukh1nufbvgr2lbypaclg.bx.internal.cloudapp.net,zookeepernode1.l2lfbpukh1nufbvgr2lbypaclg.bx.internal.cloudapp.net,zookeepernode2.l2lfbpukh1nufbvgr2lbypaclg.bx.internal.cloudapp.net:2181:/hbase-unsecure');
DUMP T;
# Load data by query out of hbase into pig, successful!
Q1 = LOAD 'hbase://query/SELECT FIRST_NAME, LAST_NAME FROM EXAMPLE WHERE MY_PK
= 825241648' using org.apache.phoenix.pig.PhoenixHBaseLoader
('zookeepernode0.l2lfbpukh1nufbvgr2lbypaclg.bx.internal.cloudapp.net,zookeepernode1.l2lfbpukh1nufbvgr2lbypaclg.bx.internal.cloudapp.net,zookeepernode2.l2lfbpukh1nufbvgr2lbypaclg.bx.internal.cloudapp.net:2181:/hbase-unsecure');
DUMP Q1;
Q2 = LOAD 'hbase://query/select * from EXAMPLE where FIRST_NAME = 'John';'
using org.apache.phoenix.pig.PhoenixHBaseLoader
('zookeepernode0.l2lfbpukh1nufbvgr2lbypaclg.bx.internal.cloudapp.net,zookeepernode1.l2lfbpukh1nufbvgr2lbypaclg.bx.internal.cloudapp.net,zookeepernode2.l2lfbpukh1nufbvgr2lbypaclg.bx.internal.cloudapp.net:2181:/hbase-unsecure');
DUMP Q2;
grunt> Q2 = LOAD 'hbase://query/select * from EXAMPLE where FIRST_NAME =
'John';' using org.apache.phoenix.pig.PhoenixHBaseLoader
('zookeepernode0.l2lfbpukh1nufbvgr2lbypaclg.bx.internal.cloudapp.net,zookeepernode1.l2lfbpukh1nufbvgr2lbypaclg.bx.internal.cloudapp.net,zookeepernode2.l2lfbpukh1nufbvgr2lbypaclg.bx.internal.cloudapp.net:2181:/hbase-unsecure');
2015-10-21 16:34:08,911 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR
1200: <line 1, column 67> mismatched input 'John' expecting SEMI_COLON
Details at logfile: /home/hdiuser/pig_1445445229498.log
Pig Stack Trace
---------------
ERROR 1200: <line 1, column 67> mismatched input 'John' expecting SEMI_COLON
Failed to parse: <line 1, column 67> mismatched input 'John' expecting
SEMI_COLON
at
org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:244)
at
org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:182)
at
org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1707)
at
org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1680)
at org.apache.pig.PigServer.registerQuery(PigServer.java:623)
at
org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1063)
at
org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:501)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
at org.apache.pig.Main.run(Main.java:558)
at org.apache.pig.Main.main(Main.java:170)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
Bill