Re: Full text query in Phoenix

2016-09-19 Thread Cheyenne Forbes
Hi James,

Thanks a lot. I found a link showing how to integrate HBase with Lucene:
https://itpeernetwork.intel.com/idh-hbase-lucene-integration/


Re: Using COUNT() with columns that don't use COUNT() fails when the table is joined

2016-09-19 Thread Steve Terrell
Hi!  I think you need something like
group by u.first_name
on the end.  Best guess.  :)
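
Something like this, if I'm reading the quoted query right (just a sketch of
that guess, using the table names from the original mail):

SELECT COUNT(fr.friend_1), u.first_name
FROM users AS u
LEFT JOIN friends AS fr ON u.id = fr.friend_2
GROUP BY u.first_name;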

On Sun, Sep 18, 2016 at 11:03 PM, Cheyenne Forbes <
cheyenne.osanu.for...@gmail.com> wrote:

> this query fails:
>
> SELECT COUNT(fr.friend_1), u.first_name
>>
>> FROM users AS u
>>
>> LEFT JOIN friends AS fr ON u.id = fr.friend_2
>>
>>
> with:
>
> SQLException: ERROR 1018 (42Y27): Aggregate may not contain columns not in
>> GROUP BY. U.FIRST_NAME
>>
>
> TABLES:
>
> users table with these columns ( id, first_name, last_name )
>
>
> friends table with these columns ( friend_1, friend_2 )
>
>
>


Re: Using MapReduce, java.sql.SQLException: No suitable driver found for jdbc:phoenix occurred.

2016-09-19 Thread Josh Elser
Do you have the file "META-INF/services/java.sql.Driver" with (at least) 
the contents "org.apache.phoenix.jdbc.PhoenixDriver" in your custom jar?


It sounds like you incorrectly built the jar. Read up on how Java's 
ServiceLoader works for an explanation as to why your custom jar fails.
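
If you want to verify what is actually registered at runtime, a minimal
sketch like this (run with your fat jar on the classpath) lists every driver
the ServiceLoader can see:

import java.sql.Driver;
import java.util.ServiceLoader;

public class DriverCheck {
    public static void main(String[] args) {
        // DriverManager discovers JDBC drivers through ServiceLoader, which
        // reads every META-INF/services/java.sql.Driver file on the
        // classpath. If the fat jar was built without merging those files,
        // Phoenix will be missing from this list even though the
        // PhoenixDriver class itself is present.
        for (Driver d : ServiceLoader.load(Driver.class)) {
            System.out.println(d.getClass().getName());
        }
    }
}

If you build the fat jar with the Maven Shade plugin, its
ServicesResourceTransformer is the usual way to merge those service files.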


Dong-iL, Kim wrote:

Hi.
I created a fat jar with client.jar.
There is the PhoenixDriver class in that fat jar,
and the Class.forName call throws no error.
Is there anything else I should check?
Regards.


On Sep 15, 2016, at 8:59 AM, Josh Elser  wrote:

phoenix-4.8.0-HBase-1.1-client.jar is the jar which should be used. The 
phoenix-4.8.0-HBase-1.1-hive.jar is to be used with the Hive integration.

dalin.qin wrote:

[root@namenode phoenix]# findjar . org.apache.phoenix.jdbc.PhoenixDriver
Starting search for JAR files from directory .
Looking for the class org.apache.phoenix.jdbc.PhoenixDriver

This might take a while...

./phoenix-4.8.0-HBase-1.1-client.jar
./phoenix-4.8.0-HBase-1.1-server.jar
./phoenix-4.8.0-HBase-1.1-hive.jar
./phoenix-core-4.8.0-HBase-1.1-tests.jar
./phoenix-core-4.8.0-HBase-1.1.jar
./phoenix-core-4.8.0-HBase-1.1-sources.jar

Adding phoenix-4.8.0-HBase-1.1-client.jar
or phoenix-4.8.0-HBase-1.1-hive.jar (I'm not quite sure which one; you
might try both) to your classpath might solve your problem.

On Tue, Sep 13, 2016 at 7:05 AM, Dong-iL, Kim <kim.s...@gmail.com> wrote:

Hi.
I've tested the MapReduce code from the homepage.
It couldn't find the JDBC driver, as shown below.
I inserted the line
"Class.forName("org.apache.phoenix.jdbc.PhoenixDriver");" in the main
method of the MapReduce job, but it had no effect.
What shall I do?
Regards.


Error: java.lang.RuntimeException: java.sql.SQLException: No suitable driver found for jdbc:phoenix:internal.hadoop-master.denma.ggportal.net:2181:/hbase;
    at org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:134)
    at org.apache.phoenix.mapreduce.PhoenixInputFormat.createRecordReader(PhoenixInputFormat.java:71)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:524)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:767)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.sql.SQLException: No suitable driver found for jdbc:phoenix:internal.hadoop-master.denma.ggportal.net:2181:/hbase;
    at java.sql.DriverManager.getConnection(DriverManager.java:689)
    at java.sql.DriverManager.getConnection(DriverManager.java:208)
    at org.apache.phoenix.mapreduce.util.ConnectionUtil.getConnection(ConnectionUtil.java:98)
    at org.apache.phoenix.mapreduce.util.ConnectionUtil.getInputConnection(ConnectionUtil.java:57)
    at org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:116)
    ... 9 more






Re: Exception connecting to a server with Phoenix 4.7 installed

2016-09-19 Thread Josh Elser

Superb. Glad you got it figured out.

Long, Xindian wrote:

Hi, Josh:

Thanks for the suggestion. Adding hbase-site.xml to the Spark conf directory
solved the problem.

Xindian


-Original Message-
From: Josh Elser [mailto:josh.el...@gmail.com]
Sent: Thursday, September 15, 2016 5:43 PM
To: user@phoenix.apache.org
Subject: Re: Exception connecting to a server with Phoenix 4.7 installed

Hi Xindian,

A couple of initial things that come to mind...

* Make sure that you're using HDP "bits" (jars) everywhere to remove any 
possibility that there's an issue between what Hortonworks ships and what's in Apache.
* Make sure that your Java application/Spark job has the correct hbase-site.xml
on the classpath (one way to do this is sketched below). Both of these require
effort on your part to make sure that the runtime has them.
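
One common way to handle the second item, assuming /etc/hbase/conf holds the
right hbase-site.xml on every node (the paths here are placeholders for your
install, not something I know about your cluster):

spark-submit \
  --conf spark.driver.extraClassPath=/etc/hbase/conf \
  --conf spark.executor.extraClassPath=/etc/hbase/conf \
  ... your application jar and arguments ...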

Long, Xindian wrote:

Hi:

While trying to connect to an HBase server with Phoenix 4.7 installed,
using a client of the same version,

I got the following exception:

java.sql.SQLException: ERROR 726 (43M10): Inconsistent namespace
mapping properties

Cannot initiate connection as SYSTEM:CATALOG is found but client does
not have phoenix.schema.isNamespaceMappingEnabled enabled

I checked the server and client; both sides have the following
options set to true in the hbase-site config:

phoenix.schema.isNamespaceMappingEnabled,

phoenix.schema.mapSystemTablesToNamespace

Attached are the traces and the config screenshots for the HBase
client and server. The platform is HDP 2.5 on both sides.

Trace1 is from using the Phoenix JDBC driver directly; trace2 is from using
it through Spark.

Any idea what I should do?

Thanks

Xindian

Server side:

client side:



Re: Using COUNT() with columns that don't use COUNT() fails when the table is joined

2016-09-19 Thread Cheyenne Forbes
Hi Steve,

Thank you, it works when I add GROUP BY. Can I avoid using GROUP BY, or
avoid adding all of my columns to it, if I have 10 columns being queried?


Re: Using COUNT() with columns that don't use COUNT() fails when the table is joined

2016-09-19 Thread Steve Terrell
I'm not an expert in traditional SQL or in Phoenix SQL, but my best guess
is "probably not".

But I'm curious as to why you would like to avoid the group by or the list
of columns.  I know it looks very wordy, but are there any technical
reasons?  In my experience SQL is by nature hard on human eyes, so I've
just gotten used to it.

On Mon, Sep 19, 2016 at 10:06 AM, Cheyenne Forbes <
cheyenne.osanu.for...@gmail.com> wrote:

> Hi Steve,
>
> Thank you, it works when I add GROUP BY. Can I avoid using GROUP BY, or
> avoid adding all of my columns to it, if I have 10 columns being queried?
>


Re: Using COUNT() with columns that don't use COUNT() fails when the table is joined

2016-09-19 Thread Cheyenne Forbes
I was wondering because it seems extra wordy.


How are DataFrames partitioned by default when using Spark?

2016-09-19 Thread Long, Xindian
How are DataFrames/Datasets/RDDs partitioned by default when using Spark,
assuming the DataFrame/Dataset/RDD is the result of a query like this:

select col1, col2, col3 from table3 where col3 > xxx

I noticed that for HBase, a partitioner partitions the rowkeys based on
region splits; can Phoenix do this as well?

I also read that if I use Spark with the Phoenix JDBC interface, "it's only
able to parallelize queries by partitioning on a numeric column. It also
requires a known lower bound, upper bound and partition count in order to
create split queries."
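
For reference, the options that quote refers to are the standard Spark JDBC
ones; this sketch shows what I mean (Java, Spark 2.x style; the table,
column, and quorum names are placeholders):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class PhoenixJdbcRead {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("phoenix-jdbc-read").getOrCreate();
        // The generic JDBC source splits the read into numPartitions range
        // queries over the (numeric) partition column.
        Dataset<Row> df = spark.read()
                .format("jdbc")
                .option("driver", "org.apache.phoenix.jdbc.PhoenixDriver")
                .option("url", "jdbc:phoenix:zkhost:2181:/hbase")
                .option("dbtable", "TABLE3")
                .option("partitionColumn", "COL1") // must be numeric
                .option("lowerBound", "0")
                .option("upperBound", "1000000")
                .option("numPartitions", "10")
                .load();
        df.show();
    }
}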

Question 1: If I specify options like these, is the partitioning based on
segmenting the range evenly, i.e. does each partition get a rowkey range of
roughly (upperLimit - lowerLimit) / partitionCount?

Question 2: If I do not specify any range, or the row key is not a numeric
column, how is the result partitioned when using JDBC?


If I use the phoenix-spark plugin, it is mentioned that it is able to
leverage the underlying splits provided by Phoenix. Are there any example
scenarios of that? E.g., can it partition the resulting DataFrame based on
the regions of the underlying HBase table, so that Spark can take advantage
of data locality? A sketch of how I load through the plugin follows.
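
For reference, this is roughly how I load through the plugin (Java; the
table, quorum, and column names are placeholders):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class PhoenixSparkRead {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("phoenix-spark-read").getOrCreate();
        // Load TABLE3 through the phoenix-spark data source rather than
        // the generic JDBC source.
        Dataset<Row> df = spark.read()
                .format("org.apache.phoenix.spark")
                .option("table", "TABLE3")
                .option("zkUrl", "zkhost:2181")
                .load();
        df.filter("COL3 > 100").show();
    }
}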

Thanks

Xindian


Re: Full text query in Phoenix

2016-09-19 Thread Jean-Marc Spaggiari
HBase + Lily Indexer + SOLR will do that very well. As James said, Phoenix
might not help with the full-text part. Google for it and you will find many
pointers to web articles and even books.

JMS

2016-09-19 9:05 GMT-04:00 Cheyenne Forbes :

> Hi James,
>
> Thanks a lot. I found a link showing how to integrate HBase with Lucene:
> https://itpeernetwork.intel.com/idh-hbase-lucene-integration/
>


Re: Using COUNT() with columns that don't use COUNT() fails when the table is joined

2016-09-19 Thread Michael McAllister
This is really an ANSI SQL question. If you use an aggregate function, then you 
need to specify what columns to group by. Any columns not being referenced in 
the aggregate function(s) need to be in the GROUP BY statement.
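
If the motivation is just to avoid listing every selected column, one common
workaround (a sketch only, using the tables from the earlier mail; it assumes
your Phoenix version supports derived tables in joins) is to aggregate in a
subquery and join the count back:

SELECT u.first_name, u.last_name, f.friend_count
FROM users AS u
LEFT JOIN (
    SELECT friend_2, COUNT(*) AS friend_count
    FROM friends
    GROUP BY friend_2
) AS f ON u.id = f.friend_2;

That way only the aggregation key appears in a GROUP BY.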

Michael McAllister
Staff Data Warehouse Engineer | Decision Systems
mmcallis...@homeaway.com | C: 512.423.7447 | skype: michael.mcallister.ha | webex: https://h.a/mikewebex

From: Cheyenne Forbes 
Reply-To: "user@phoenix.apache.org" 
Date: Monday, September 19, 2016 at 10:50 AM
To: "user@phoenix.apache.org" 
Subject: Re: Using COUNT() with columns that don't use COUNT() fails when
the table is joined

I was wondering because it seems extra wordy


Re: Using COUNT() with columns that don't use COUNT() fails when the table is joined

2016-09-19 Thread Maryann Xue
Thank you very much for your answer, Michael! Yes, what Cheyenne tried to
use was simply not valid SQL grammar.


Thanks,
Maryann

On Mon, Sep 19, 2016 at 10:47 AM, Michael McAllister <
mmcallis...@homeaway.com> wrote:

> This is really an ANSI SQL question. If you use an aggregate function,
> then you need to specify what columns to group by. Any columns not being
> referenced in the aggregate function(s) need to be in the GROUP BY
> statement.
>
>
>
> Michael McAllister
>
> Staff Data Warehouse Engineer | Decision Systems
>
> mmcallis...@homeaway.com | C: 512.423.7447 | skype: michael.mcallister.ha
>  | webex: https://h.a/mikewebex
>
>
>
>
> *From: *Cheyenne Forbes 
> *Reply-To: *"user@phoenix.apache.org" 
> *Date: *Monday, September 19, 2016 at 10:50 AM
> *To: *"user@phoenix.apache.org" 
> *Subject: *Re: Using COUNT() with columns that don't use COUNT() fails
> when the table is joined
>
>
>
> I was wondering because it seems extra wordy
>


Re: Phoenix + Spark + JDBC + Kerberos?

2016-09-19 Thread Jean-Marc Spaggiari
Thanks for the pointer to PHOENIX-3189, Josh. I don't think we are facing
that.

We will try to activate the debug mode on Kerberos and retry. Good idea!

I will keep this thread updated if we find something...
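
For anyone following along, the keytab-in-URL form we mean is roughly this
(the quorum, realm, and paths are placeholders, not our real values):

jdbc:phoenix:zk1,zk2,zk3:2181:/hbase-secure:appuser@EXAMPLE.COM:/etc/security/keytabs/appuser.keytab

And one way to turn on the Kerberos debug output Josh suggested, using the
standard Spark conf keys:

spark-submit \
  --conf spark.driver.extraJavaOptions=-Dsun.security.krb5.debug=true \
  --conf spark.executor.extraJavaOptions=-Dsun.security.krb5.debug=true \
  ... your application jar and arguments ...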

JMS

2016-09-15 17:39 GMT-04:00 Josh Elser :

> Cool, thanks for the info, JM. Thinking out loud..
>
> * Could be missing/inaccurate /etc/krb5.conf on the nodes running spark
> tasks
> * Could try setting the Java system property sun.security.krb5.debug=true
> in the Spark executors
> * Could try to set org.apache.hadoop.security=DEBUG in log4j config
>
> Hard to guess at the real issue without knowing more :). Any more context
> you can share, I'd be happy to try to help.
>
> (ps. obligatory warning about PHOENIX-3189 if you're using 4.8.0)
>
> Jean-Marc Spaggiari wrote:
>
>> Using the keytab in the JDBC URL. That's the way we use it locally, and
>> we also tried running command-line applications directly from the worker
>> nodes, and it works. But inside the Spark executor it doesn't...
>>
>> 2016-09-15 13:07 GMT-04:00 Josh Elser:
>>
>> How do you expect Kerberos authentication for JDBC on Spark to work?
>> Are you using the principal+keytab options in the Phoenix JDBC URL, or
>> is Spark itself obtaining a ticket for you (via some "magic")?
>>
>>
>> Jean-Marc Spaggiari wrote:
>>
>> Hi,
>>
>> I tried to build a small app all under Kerberos.
>>
>> JDBC to Phoenix works
>> Client to HBase works
>> Client (puts) on Spark to HBase works.
>> But JDBC on Spark to HBase fails with a message like "GSSException: No
>> valid credentials provided (Mechanism level: Failed to find any
>> Kerberos tgt)]"
>>
>> Keytab is accessible on all the nodes.
>>
>> Keytab belongs to the user running the job, and executors are
>> running
>> under that user name. So this is fine.
>>
>> Any idea what this might be?
>>
>> Thanks,
>>
>> JMS
>>
>>
>>


Combining an RVC query and a filter on a datatype smaller than 8 bytes causes an Illegal Data Exception

2016-09-19 Thread Kumar Palaniappan
Has anyone faced this issue?

https://issues.apache.org/jira/browse/PHOENIX-3297

And this one returns no rows:

SELECT * FROM TEST.RVC_TEST WHERE (COLONE, COLTWO) IN (1,2) AND COLTHREE =3
AND COLFOUR=4;
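
For readers without the JIRA open: the exact DDL is in PHOENIX-3297, but a
hypothetical table shape matching the subject line's "datatype smaller than
8 bytes" would be something like (an illustration only, with guessed types):

CREATE TABLE TEST.RVC_TEST (
    COLONE INTEGER NOT NULL,
    COLTWO INTEGER NOT NULL,
    COLTHREE INTEGER NOT NULL,
    COLFOUR INTEGER NOT NULL
    CONSTRAINT PK PRIMARY KEY (COLONE, COLTWO, COLTHREE, COLFOUR)
);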


Re: Combining an RVC query and a filter on a datatype smaller than 8 bytes causes an Illegal Data Exception

2016-09-19 Thread Samarth Jain
Kumar,

Can you try with the 4.8 release?



On Mon, Sep 19, 2016 at 2:54 PM, Kumar Palaniappan <
kpalaniap...@marinsoftware.com> wrote:

>
> Any one had faced this issue?
>
> https://issues.apache.org/jira/browse/PHOENIX-3297
>
> And this one gives no rows
>
> SELECT * FROM TEST.RVC_TEST WHERE (COLONE, COLTWO) IN (1,2) AND COLTHREE
> =3 AND COLFOUR=4;
>
>
>
>


Re: Combining an RVC query and a filter on a datatype smaller than 8 bytes causes an Illegal Data Exception

2016-09-19 Thread Kumar Palaniappan
No, I didn't.

But wrapping the tuple in an extra set of parentheses worked:

SELECT * FROM TEST.RVC_TEST WHERE (COLONE, COLTWO) IN ((1,2)) AND
COLTHREE=3;

SELECT * FROM TEST.RVC_TEST WHERE ((COLONE, COLTWO) IN ((1,2)) AND
(COLFOUR=4));

On Mon, Sep 19, 2016 at 2:56 PM, Samarth Jain  wrote:

> Kumar,
>
> Can you try with the 4.8 release?
>
>
>
> On Mon, Sep 19, 2016 at 2:54 PM, Kumar Palaniappan <
> kpalaniap...@marinsoftware.com> wrote:
>
>>
>> Any one had faced this issue?
>>
>> https://issues.apache.org/jira/browse/PHOENIX-3297
>>
>> And this one gives no rows
>>
>> SELECT * FROM TEST.RVC_TEST WHERE (COLONE, COLTWO) IN (1,2) AND COLTHREE
>> =3 AND COLFOUR=4;
>>
>>
>>
>>
>


Re: Combining an RVC query and a filter on a datatype smaller than 8 bytes causes an Illegal Data Exception

2016-09-19 Thread Kumar Palaniappan
The problem is that with just one tuple in the RVC IN list it works,
but this one, with two or more tuples:

SELECT * FROM TEST.RVC_TEST WHERE (COLONE, COLTWO) IN ((1,2),(1,2)) AND
COLTHREE=3;

blows up.


On Mon, Sep 19, 2016 at 3:58 PM, Kumar Palaniappan <
kpalaniap...@marinsoftware.com> wrote:

> No, I didn't.
>
> But wrapping the tuple in an extra set of parentheses worked:
>
> SELECT * FROM TEST.RVC_TEST WHERE (COLONE, COLTWO) IN ((1,2)) AND
> COLTHREE=3;
>
> SELECT * FROM TEST.RVC_TEST WHERE ((COLONE, COLTWO) IN ((1,2)) AND
> (COLFOUR=4));
>
> On Mon, Sep 19, 2016 at 2:56 PM, Samarth Jain  wrote:
>
>> Kumar,
>>
>> Can you try with the 4.8 release?
>>
>>
>>
>> On Mon, Sep 19, 2016 at 2:54 PM, Kumar Palaniappan <
>> kpalaniap...@marinsoftware.com> wrote:
>>
>>>
>>> Any one had faced this issue?
>>>
>>> https://issues.apache.org/jira/browse/PHOENIX-3297
>>>
>>> And this one gives no rows
>>>
>>> SELECT * FROM TEST.RVC_TEST WHERE (COLONE, COLTWO) IN (1,2) AND COLTHREE
>>> =3 AND COLFOUR=4;
>>>
>>>
>>>
>>>
>>
>