Re: sc.phoenixTableAsRDD number of initial partitions

2016-10-13 Thread Ciureanu Constantin
Hi Antonio,
Reading the whole table is not a good use case for Phoenix / HBase or any
DB. You should never store the whole content read from the DB / disk in
memory; that is definitely wrong. Spark doesn't do that by itself, no matter
what "they" told you it would do to be faster. Review your algorithm and see
what can be improved. After all, I hope you are just using collect(), so the
OOM is on the driver (that's easier to fix, :p by not using it).
Back to the OOM: after reading an RDD you can shuffle / repartition it into
any number of partitions easily (but that sends data over the network, so
it's expensive):
repartition(numPartitions)
http://spark.apache.org/docs/latest/programming-guide.html
I recommend reading that page plus a few articles on Spark best practices.
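
For example, a minimal sketch of that with the phoenix-spark integration (the
table name, columns and ZooKeeper quorum below are placeholders, so adjust
them to your cluster):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.phoenix.spark._  // adds phoenixTableAsRDD to SparkContext

val sc = new SparkContext(new SparkConf().setAppName("phoenix-read"))

// Placeholder table, columns and ZK quorum -- not your actual schema.
val rdd = sc.phoenixTableAsRDD(
  "MY_TABLE",
  Seq("ID", "COL1"),
  zkUrl = Some("zk-host:2181"))

// One partition per region / region server can be too coarse; spread the data
// over more, smaller partitions at the cost of a shuffle over the network.
val finer = rdd.repartition(200)
println(finer.count())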

Kind regards,
Constantin

On Thu, 13 Oct 2016, 18:16, Antonio Murgia wrote:

> Hello everyone,
>
> I'm trying to read data from a Phoenix table using Apache Spark. I
> actually use the suggested method, sc.phoenixTableAsRDD, without issuing
> any query (i.e. reading the whole table), and I noticed that the number
> of partitions that Spark creates is equal to the number of
> region servers. Is there a way to use a custom number of regions?
>
> The problem we actually face is that if a region is bigger than the
> available memory of the Spark executor, it goes OOM. Being able to
> tune the number of regions, we might use a higher number of partitions,
> reducing the memory footprint of the processing (and also slowing it
> down, I know :( ).
>
> Thank you in advance
>
> #A.M.
>
>


Ordering of numbers generated by a sequence

2016-10-13 Thread F21

I am using Phoenix 4.8.1 with HBase 1.2.3 and the Phoenix Query Server.

I want to use a sequence to generate a monotonically increasing id for 
each row. Since the documentation states that 100 sequence numbers are 
cached by default in the client (in my case, I assume the caching would 
happen in the query server), what is the behavior if I have 2 query 
servers (load-balanced)? Does this mean Server A would generate numbers 
starting from 0, and Server B would generate numbers starting from 100? 
I need to make sure that the id is in order on a global basis for the
whole table. Would setting the CACHE to 0 be the best way of achieving this?
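
For concreteness, I mean a sequence defined roughly like the sketch below,
going through the thin JDBC driver (the query server URL and sequence name
are placeholders, and the CACHE clause is only there to show where the
per-client reservation size is set):

import java.sql.DriverManager

// Placeholder PQS endpoint; the thin driver ships in the
// phoenix-queryserver-client jar.
val conn = DriverManager.getConnection(
  "jdbc:phoenix:thin:url=http://pqs-host:8765;serialization=PROTOBUF")
val stmt = conn.createStatement()

// CACHE controls how many values each client reserves per allocation.
stmt.execute("CREATE SEQUENCE my_seq START WITH 1 INCREMENT BY 1 CACHE 1")
val rs = stmt.executeQuery("SELECT NEXT VALUE FOR my_seq")
while (rs.next()) println(rs.getLong(1))
conn.close()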


Also, as the ID is monotonically increasing, I plan to salt the table 
using something like (no manual split points):


CREATE TABLE mytable (id BIGINT NOT NULL PRIMARY KEY,  VARCHAR) 
SALT_BUCKETS = 20;


Without setting phoenix.query.rowKeyOrderSaltedTable to true, would I 
still be able to get my records in order if I select them using 
something like this?


SELECT * FROM mytable WHERE id > 5 AND id < 100 ORDER BY id

Thanks,

Francis



Re: Creating secondary index on Phoenix view on Hbase table throws error

2016-10-13 Thread Mich Talebzadeh
This works when the index is created from the Phoenix server rather than from
the Phoenix client.

Thanks

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 12 October 2016 at 23:38, Mich Talebzadeh 
wrote:

> Yes, correct, that is hbase-site.xml, Ted.
>
> I am running HBase in standalone mode. Do I need a region server?
>
> thx
>
> Dr Mich Talebzadeh
>
>
> On 12 October 2016 at 23:25, Ted Yu  wrote:
>
>> bq. my h-base-site.xml
>>
>> Seems to be typo above - did you mean hbase-site.xml ?
>>
>> Have you checked every region server w.r.t. the value
>> for hbase.regionserver.wal.codec ?
>>
>> Cheers
>>
>> On Wed, Oct 12, 2016 at 3:22 PM, Mich Talebzadeh <
>> mich.talebza...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> In the following, "marketDataHbase" is a view on an HBase table.
>>>
>>> This is my h-base-site.xml (running HBase in standalone mode)
>>>
>>> <property>
>>>   <name>hbase.defaults.for.version.skip</name>
>>>   <value>true</value>
>>> </property>
>>> <property>
>>>   <name>hbase.regionserver.wal.codec</name>
>>>   <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
>>> </property>
>>> <property>
>>>   <name>hbase.region.server.rpc.scheduler.factory.class</name>
>>>   <value>org.apache.hadoop.hbase.ipc.PhoenixRpcSchedulerFactory</value>
>>>   <description>Factory to create the Phoenix RPC Scheduler that uses
>>> separate queues for index and metadata updates</description>
>>> </property>
>>> <property>
>>>   <name>hbase.rpc.controllerfactory.class</name>
>>>   <value>org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory</value>
>>>   <description>Factory to create the Phoenix RPC Scheduler that uses
>>> separate queues for index and metadata updates</description>
>>> </property>
>>> <property>
>>>   <name>phoenix.functions.allowUserDefinedFunctions</name>
>>>   <value>true</value>
>>>   <description>enable UDF functions</description>
>>> </property>
>>>
>>> and I have restarted HBase but am still getting the below error!
>>> 0: jdbc:phoenix:thin:url=http://rhes564:8765> create index ticker_index
>>> on "marketDataHbase" ("ticker");
>>> Error: Error -1 (0) : Error while executing SQL "create index
>>> ticker_index on "marketDataHbase" ("ticker")": Remote driver error:
>>> RuntimeException: java.sql.SQLException: ERROR 1029 (42Y88): Mutable
>>> secondary indexes must have the hbase.regionserver.wal.codec property set
>>> to org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec in the
>>> hbase-sites.xml of every region server. tableName=TICKER_INDEX ->
>>> SQLException: ERROR 1029 (42Y88): Mutable secondary indexes must have the
>>> hbase.regionserver.wal.codec property set to 
>>> org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec
>>> in the hbase-sites.xml of every region server. tableName=TICKER_INDEX
>>> (state=0,code=-1)
>>>
>>> Thanks
>>>
>>> Dr Mich Talebzadeh
>>>
>>
>>
>


Re: PrepareAndExecute statement return only 100 rows

2016-10-13 Thread Josh Elser

Hi Puneeth,

What version of Phoenix are you using?

Indeed per [1], maxRowCount should control the number of rows returned 
in the ExecuteResponse. However, given that you see 100 rows (which is 
the default), it sounds like the value is not being respected. The most 
recent docs may not align with the version of code you're running.


Unless you can guarantee that you never see more than a few hundred 
rows, it is likely not a good idea to request all of the rows in one 
request (use the FetchRequest to get subsequent batches).
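
As a rough sketch (untested), a follow-up fetch would look something like the
code below. The ids and endpoint are placeholders, the field names come from
the FetchRequest section of the JSON reference at [1], and the transport
mirrors the "Request" header style of your PHP snippet, so double-check both
against your Avatica version:

import java.net.{HttpURLConnection, URL}
import scala.io.Source

// Placeholder endpoint and ids -- reuse the connectionId and statementId from
// the earlier prepareAndExecute call. fetchMaxRowCount bounds this batch only;
// bump offset by the frame size on each subsequent fetch.
val fetchJson =
  """{"request":"fetch","connectionId":"<your-connection-id>",""" +
  """"statementId":1,"offset":100,"fetchMaxRowCount":100}"""

val conn = new URL("http://ip.address.of.phoenix.server:8765/")
  .openConnection().asInstanceOf[HttpURLConnection]
conn.setRequestMethod("POST")
conn.setDoOutput(true)
conn.setRequestProperty("Request", fetchJson) // same header transport as the PHP
conn.getOutputStream.close()                  // empty body; the header carries the request
println(Source.fromInputStream(conn.getInputStream, "UTF-8").mkString)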


- Josh

[1] 
http://calcite.apache.org/avatica/docs/json_reference.html#prepareandexecuterequest


Puneeth Prasad wrote:

Hi,

PrepareAndExecute statement has a default limit of returning 100 rows.
To avoid that, we use maxRowCount = -1, but it still returns only 100 rows.

I've copied the PHP code below; the highlighted part is the necessary change
to fetch all the rows. Can you please suggest where we've gone wrong and how
to correct it? Is there something obvious we missed here?

curl_setopt($ch, CURLOPT_URL, "http://ip.address.of.phoenix.server:8765/");

curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

curl_setopt($ch, CURLOPT_POST, 1);

$headers[] = "Request:
{\"request\":\"openConnection\",\"connectionId\":\"00---\"}";

curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

$result = curl_exec($ch);

$headers[] = "Request:
{\"request\":\"createStatement\",\"connectionId\":\"00---\"}";

curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

$result = curl_exec($ch);

$headers[] = "Request:
{\"request\":\"prepareAndExecute\",\"connectionId\":\"00---\",\"statementId\":
".$a.",\"sql\": \"SELECT * FROM TABLE_NAME\",*\"maxRowCount\":-1*}";

curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

$result = curl_exec($ch);

$headers[] = "Request:
{\"request\":\"closeStatement\",\"connectionId\":\"00---\",\"statementId\":
1}";

curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

$result = curl_exec($ch);

$headers[] = "Request:
{\"request\":\"closeConnection\",\"connectionId\":\"00---\"}";

curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

$result = curl_exec($ch);

Thanks!

Puneeth



PrepareAndExecute statement return only 100 rows

2016-10-13 Thread Puneeth Prasad
Hi,

PrepareAndExecute statement has a default limit of returning 100 rows.
To avoid that, we use maxRowCount = -1, but it still returns only 100 rows.

I've copied the PHP code below; the highlighted part is the necessary change
to fetch all the rows. Can you please suggest where we've gone wrong and how
to correct it? Is there something obvious we missed here?

curl_setopt($ch, CURLOPT_URL, "http://ip.address.of.phoenix.server:8765/");

curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

curl_setopt($ch, CURLOPT_POST, 1);

$headers[] = "Request:
{\"request\":\"openConnection\",\"connectionId\":\"00---\"}";

curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

$result = curl_exec($ch);

$headers[] = "Request:
{\"request\":\"createStatement\",\"connectionId\":\"00---\"}";

curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

$result = curl_exec($ch);

$headers[] = "Request:
{\"request\":\"prepareAndExecute\",\"connectionId\":\"00---\",\"statementId\":
".$a.",\"sql\": \"SELECT * FROM TABLE_NAME\",\"maxRowCount\":-1}";

curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

$result = curl_exec($ch);

$headers[] = "Request:
{\"request\":\"closeStatement\",\"connectionId\":\"00---\",\"statementId\":
1}";

curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

$result = curl_exec($ch);

$headers[] = "Request:
{\"request\":\"closeConnection\",\"connectionId\":\"00---\"}";

curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

$result = curl_exec($ch);

Thanks!

Puneeth

 



sc.phoenixTableAsRDD number of initial partitions

2016-10-13 Thread Antonio Murgia

Hello everyone,

I'm trying to read data from a Phoenix table using Apache Spark. I
actually use the suggested method, sc.phoenixTableAsRDD, without issuing
any query (i.e. reading the whole table), and I noticed that the number
of partitions that Spark creates is equal to the number of
region servers. Is there a way to use a custom number of regions?


The problem we actually face is that if a region is bigger than the
available memory of the Spark executor, it goes OOM. Being able to
tune the number of regions, we might use a higher number of partitions,
reducing the memory footprint of the processing (and also slowing it
down, I know :( ).


Thank you in advance

#A.M.



Re: Bulk dataload and dynamic columns

2016-10-13 Thread Sanooj Padmakumar
Thanks for the confirmation Anil.

On Fri, Oct 7, 2016 at 11:22 PM, anil gupta  wrote:

> I don't think that feature is supported yet in the bulk load tool.
>
> On Thu, Oct 6, 2016 at 9:55 PM, Sanooj Padmakumar 
> wrote:
>
>> Hi All,
>>
>> Can we populate dynamic columns as well while bulk loading data (
>> https://phoenix.apache.org/bulk_dataload.html) into HBase using Phoenix?
>> It didn't work when we tried, hence posting it here. Thanks in advance.
>>
>> --
>> Thanks,
>> Sanooj Padmakumar
>>
>
>
>
> --
> Thanks & Regards,
> Anil Gupta
>



-- 
Thanks,
Sanooj Padmakumar


Re: Views and alter table

2016-10-13 Thread Sanooj Padmakumar
Hi James,

I managed to reproduce this.

create table test1(key1 varchar not null,key2 varchar not null,"p".val1
varchar,CONSTRAINT pk PRIMARY KEY(key1,key2)) Compression = 'SNAPPY'

ALTER TABLE test1 ADD "p".val2 varchar,"p".val3 varchar,"p".val4 varchar

create view test_view1(key2 varchar,val1 varchar, val2 varchar,val3
varchar,val4 varchar) AS select * from test1 where key1='key1'

ALTER TABLE test1 ADD "p".val5 varchar

After the last step I get
Error: ERROR 1010 (42M01): Not allowed to mutate table. Cannot drop column
referenced by VIEW columnName=TEST1
SQLState:  42M01
ErrorCode: 1010

I was originally creating the table with a Java program and at that time it
wasn't reproducible; today I tried the steps with the SQuirreL client and I
could reproduce it. (Hope this helps.)

We are using Phoenix 4.5.2.

Thanks
Sanooj

On Fri, Oct 7, 2016 at 8:32 PM, James Taylor  wrote:

> Hi Sanooj,
> What version of Phoenix? Would you mind filing a JIRA with steps to
> reproduce the issue?
> Thanks,
> James
>
>
> On Friday, October 7, 2016, Sanooj Padmakumar  wrote:
>
>> Hi All
>>
>> We get a mutation-state-related error when we try altering a table to which
>> views have been added. We always have to drop the view before doing the
>> alter. Is there a way we can avoid this?
>>
>> Thanks
>> Sanooj
>>
>


-- 
Thanks,
Sanooj Padmakumar


Re: How and where can I get help to set up my "phoenix cluster" for production?

2016-10-13 Thread Ted Yu
Hortonworks does offer support. 

> On Oct 13, 2016, at 5:40 AM, Antonio Murgia  wrote:
> 
> As far as I know, Cloudera lets you install Phoenix through a Parcel, for
> free. But they do not offer support for Phoenix.
> 
>> On 10/13/2016 01:38 PM, Cheyenne Forbes wrote:
>> That's the question I should've asked myself, no?
>>
>> How can I get it done paid?
> 


Re: How and where can I get help to set up my "phoenix cluster" for production?

2016-10-13 Thread Antonio Murgia
As far as I know, Cloudera lets you install Phoenix through a Parcel,
for free. But they do not offer support for Phoenix.



On 10/13/2016 01:38 PM, Cheyenne Forbes wrote:

That's the question I should've asked myself, no?

How can I get it done paid?




Re: How and where can I get help to set up my "phoenix cluster" for production?

2016-10-13 Thread Cheyenne Forbes
That's the question I should've asked myself, no?

How can I get it done paid?


Re: How and where can I get help to set up my "phoenix cluster" for production?

2016-10-13 Thread Ted Yu
If there are people who do this for free, would you trust them?

> On Oct 13, 2016, at 4:30 AM, Cheyenne Forbes 
>  wrote:
> 
> Are there people who do this for free?


How and where can I get help to set up my "phoenix cluster" for production?

2016-10-13 Thread Cheyenne Forbes
 Are there people who do this for free?


Hbase throttling and Phoenix issue

2016-10-13 Thread Sumit Nigam
Hi,
I am trying to use the HBase throttling feature with Phoenix. HBase is 1.1.2 and
Phoenix is 4.6.
When I specify a big number of SALT_BUCKETS, HBase throws a ThrottlingException
even when quotas are high. Please note that this error occurs only when we scan
from the Phoenix shell; from the HBase shell, the same table scan goes through fine.

The following properties were set on the HBase server:
hbase.quota.enabled=true 
hbase.quota.refresh.period=5000 


1. Log in to the Phoenix shell and create a table:
create table "abc" (id bigint not null primary key, name varchar) 
salt_buckets=50; 

2. Once the table is created, apply the following quota through the HBase shell:
set_quota TYPE => THROTTLE, TABLE => 'abc', LIMIT => '10G/sec' 

Wait 5 seconds for the quota to be refreshed (we also ran list_quotas to make sure
that the quota was applied).

3. Run the following query in the Phoenix shell on this empty table:
select * from "abc"; 


Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.quotas.ThrottlingException):
org.apache.hadoop.hbase.quotas.ThrottlingException: read size limit exceeded - wait 0.00sec
at org.apache.hadoop.hbase.quotas.ThrottlingException.throwThrottlingException(ThrottlingException.java:107)
at org.apache.hadoop.hbase.quotas.ThrottlingException.throwReadSizeExceeded(ThrottlingException.java:101)
at org.apache.hadoop.hbase.quotas.TimeBasedLimiter.checkQuota(TimeBasedLimiter.java:139)
at org.apache.hadoop.hbase.quotas.DefaultOperationQuota.checkQuota(DefaultOperationQuota.java:59)
at org.apache.hadoop.hbase.quotas.RegionServerQuotaManager.checkQuota(RegionServerQuotaManager.java:180)
at org.apache.hadoop.hbase.quotas.RegionServerQuotaManager.checkQuota(RegionServerQuotaManager.java:125)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2300)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32295)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2127)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
at java.lang.Thread.run(Thread.java:745)

at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1225)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32741)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.


Thanks, 
Sumit

Can phoenix support HBase's TimeStamp?

2016-10-13 Thread Yang Zhang
Hello everyone

I saw that we can create a Phoenix table from an existing HBase table (for
details, see the link).
My question is whether Phoenix can support the historical versions of my rows.

I am trying to use Phoenix to store some info that has a lot of common
columns, such as a table "T1 ( c1, c2, c3, c4 )": many rows share the same
c1, c2, c3, and the variable column is c4.
Using HBase we can put 'T1', 'key1', 'f:c4', 'new value', timestamp,

and I can get the previous versions of this row; they all share the same
c1, c2, c3, which HBase only stores once.

Does Phoenix support querying the historical versions of my row?

I got this JIRA link; it is the same as my question.

Hadoop is used for big data, and multiple versions can help us reduce
unnecessary data; I think Phoenix should support this feature too.

If Phoenix shouldn't support multiple versions, please tell me the reason.


Anyway, thanks in advance for your help.


Re: Region start row and end row

2016-10-13 Thread Anil
Hi Cheyenne,

Thank you very much.

The load cannot be done in parallel with one JDBC connection. To make it
parallel, each node must read a set of records.

Following is my approach.

1. Create a cluster-wide singleton distributed custom service.

2. Get all region information (for the records to be read) in the
init() method of the custom service.

3. Broadcast the regions using ignite.compute().call() in the execute() method
of the custom service, so that each node reads one region's data.

4. Scan a particular region (with its start row and end row) using a scan query
and load the results into the cache (see the sketch below).


Hope this gives a clear idea.
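
A minimal sketch of step 4 using the plain HBase client API (the table name
and region boundaries below are placeholders; in the real service they come
from the region information gathered in step 2):

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Scan}
import org.apache.hadoop.hbase.util.Bytes
import scala.collection.JavaConverters._

// Placeholder boundaries -- in practice these come from the region metadata.
val regionStartKey = Bytes.toBytes("row-000")
val regionStopKey  = Bytes.toBytes("row-999")

val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
val table = connection.getTable(TableName.valueOf("MY_TABLE"))

val scan = new Scan()
scan.setStartRow(regionStartKey) // region start row, inclusive
scan.setStopRow(regionStopKey)   // region end row, exclusive

val scanner = table.getScanner(scan)
try {
  for (result <- scanner.asScala) {
    // turn each Result into a key/value pair and put it into the Ignite cache
  }
} finally {
  scanner.close()
  table.close()
  connection.close()
}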


Please let me know if you have any questions.


Thanks.




On 13 October 2016 at 13:34, Cheyenne Forbes <
cheyenne.osanu.for...@gmail.com> wrote:

> Check out this post for loading data from MySQL to Ignite
> https://dzone.com/articles/apache-ignite-how-to-read-data-from-persistent-sto
>
> and this one (recommended) on how to UPSERT to Phoenix on Ignite PUT...
> *delete, etc.*
> https://apacheignite.readme.io/docs/persistent-store#cachestore-example
>
> Just replace the MySQL things with Phoenix things (e.g. the JDBC driver,
> INSERT to UPSERT, etc.). If after reading you still have issues, feel free
> to ask in this thread for more help.
>


Re: Region start row and end row

2016-10-13 Thread Cheyenne Forbes
Check out this post for loading data from MySQL to Ignite
https://dzone.com/articles/apache-ignite-how-to-read-data-from-persistent-sto

and this one (recommended) on how to UPSERT to Phoenix on Ignite PUT...
*delete, etc.*
https://apacheignite.readme.io/docs/persistent-store#cachestore-example

Just replace the MySQL things with Phoenix things (e.g. the JDBC driver,
INSERT to UPSERT, etc.). If after reading you still have issues, feel free
to ask in this thread for more help.
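
For instance, the write path of such a CacheStore essentially boils down to a
Phoenix UPSERT over JDBC. A minimal sketch, assuming a hypothetical table and
ZooKeeper quorum (note that Phoenix leaves auto-commit off by default, so you
commit explicitly):

import java.sql.DriverManager

// Placeholder ZK quorum and table -- swap in your own. The Phoenix "thick"
// JDBC driver registers itself when phoenix-client is on the classpath.
val conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181")
val ps = conn.prepareStatement("UPSERT INTO MY_CACHE_TABLE (ID, VAL) VALUES (?, ?)")
ps.setLong(1, 42L)
ps.setString(2, "some value")
ps.executeUpdate()
conn.commit()   // Phoenix batches mutations until commit
conn.close()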