Re: Can phoenix run without hadoop? (hbase standalone)

2016-08-23 Thread Vladimir Rodionov
Can you run HBase w/o Hadoop? Minimum config: HDFS + YARN (MapReduce2) + HBase + Phoenix. You will need HDFS to store data and YARN to run bulk loads or perform other distributed tasks (mostly export/import of data). (HBase by itself does not depend on YARN/MapReduce.) -Vlad On Tue, Aug 23, 2016 at 3:23 PM,
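The fully distributed setup described above would be wired up in hbase-site.xml roughly as follows. This is a minimal sketch, not the reply's actual config; the hostnames and port are placeholders you would replace with your own:

```xml
<!-- hbase-site.xml: minimal fully-distributed setup backed by HDFS.
     Hostnames below are hypothetical examples. -->
<configuration>
  <property>
    <!-- Store HBase data on HDFS rather than the local filesystem -->
    <name>hbase.rootdir</name>
    <value>hdfs://namenode.example.com:8020/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
  </property>
</configuration>
```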

Re: Various performance questions

2016-04-28 Thread Vladimir Rodionov
Probably https://issues.apache.org/jira/browse/HBASE-15654 Vote for it and ask the developers to backport it to 1.1.x. -Vlad On Thu, Apr 28, 2016 at 6:14 PM, Michal Medvecky <medve...@pexe.so> wrote: > Yes, it does. > > On Thu, Apr 28, 2016 at 5:58 PM, Vladimir Rodionov <vlad

Re: Various performance questions

2016-04-28 Thread Vladimir Rodionov
>> 2. Is it okay that one regionserver is 10x as loaded as others? Does this server host hbase:meta? -Vlad On Thu, Apr 28, 2016 at 4:14 PM, Michal Medvecky wrote: > Hello, > > I have a 7-node hbase cluster (6 regionservers/hdfs nodes, 1 master), with > one big table containing
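A common first check for the imbalance described in the thread is to compare per-regionserver request counts (visible in the master UI or via JMX metrics) and see how far the busiest server deviates from the rest. A toy sketch with made-up numbers:

```python
# Hypothetical per-regionserver request counts (e.g. read from the
# master UI or JMX request-count metrics); rs3 is the hot server.
requests = {
    "rs1": 12000, "rs2": 11000, "rs3": 118000,
    "rs4": 9500, "rs5": 10500, "rs6": 10000,
}

def skew(counts):
    """Ratio of the busiest server's load to the (upper) median load."""
    values = sorted(counts.values())
    median = values[len(values) // 2]
    return max(values) / median

hot = max(requests, key=requests.get)
print(hot, round(skew(requests), 1))  # a ratio near 1.0 means balanced
```

A ratio around 10x, as in the thread, usually points at a single hot region (such as hbase:meta or a hot row-key range) rather than a balancer problem.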

Re: Adding table compression

2016-03-19 Thread Vladimir Rodionov
Nope, it should be transparent. New data will be compressed on flush and old data will be compressed during the next compaction. -Vlad On Fri, Mar 18, 2016 at 12:55 PM, Michael McAllister < mmcallis...@homeaway.com> wrote: > All > > > > Are there any known issues if we use the hbase shell to alter
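The shell alteration the question refers to looks roughly like this (table name, family name, and codec are hypothetical; make sure the codec's native library is installed on all regionservers first):

```
# hbase shell -- names are examples only
alter 'my_table', {NAME => 'cf', COMPRESSION => 'SNAPPY'}

# Existing data is rewritten compressed at the next compaction;
# to force that sooner, trigger a major compaction:
major_compact 'my_table'
```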

Re: Number of splits

2016-01-18 Thread Vladimir Rodionov
Number of regions, I presume? You can get this info using standard HBase API. -Vlad On Mon, Jan 18, 2016 at 4:53 AM, Sumit Nigam wrote: > Hi, > > Is there an easy way to know the number of splits a Phoenix table has? > Preferably through JDBC metadata API? > > Thanks, >

Re: help diagnosing issue

2015-09-04 Thread Vladimir Rodionov
t disabled. I thought those were > automatically connected on compactions? > > I will get the stack trace > > On 9/1/15, 3:47 PM, "Vladimir Rodionov" <vladrodio...@gmail.com> wrote: > > >OK, from beginning > > > >1. RegionTooBusy is thrown when Memsto

Re: Rebalancing after adding a new node

2015-09-03 Thread Vladimir Rodionov
HBase does that automatically for you. Regions will be redistributed by the HBase balancer, and after the next major compaction data locality will be restored, but ... the HBase balancer works on a global level (all tables) and cannot rebalance only one table; besides this, there is such a separate beast

Re: help diagnosing issue

2015-09-01 Thread Vladimir Rodionov
OK, from the beginning: 1. RegionTooBusy is thrown when the Memstore size exceeds region flush size X flush multiplier. THIS is a sign of a great imbalance on the write path - some regions are much hotter than others, or compaction cannot keep up with the load, you hit the blocking store count and flushes get
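The "flush size X flush multiplier" arithmetic above can be sketched with the usual defaults. The values below are assumptions (defaults in recent HBase 1.x; older versions shipped a multiplier of 2), so check your own hbase-site.xml:

```python
MB = 1024 * 1024

# Assumed defaults -- verify against your hbase-site.xml.
flush_size = 128 * MB     # hbase.hregion.memstore.flush.size
block_multiplier = 4      # hbase.hregion.memstore.block.multiplier

# A region starts rejecting writes (RegionTooBusyException) once its
# memstore grows past flush_size * block_multiplier.
blocking_threshold = flush_size * block_multiplier
print(blocking_threshold // MB, "MB")  # 512 MB with these defaults
```

A hot region that takes writes faster than it can flush will hit this 512 MB ceiling long before its neighbors, which is exactly the imbalance the reply describes.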

Re: taking a backup of a Phoenix database

2015-08-10 Thread Vladimir Rodionov
FYI, https://issues.apache.org/jira/browse/HBASE-14123 introduces incremental backup/restore for HBase tables and is under active development. -Vlad On Mon, Aug 10, 2015 at 12:55 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Except that you have to snapshot EVERYTHING... If you get

Re: RegionServers shutdown randomly

2015-08-06 Thread Vladimir Rodionov
What do the DN and NN logs say? Do you run any other workload on the same cluster? What is your cluster configuration? Max memory per RS, DN and other collocated processes? -Vlad On Thu, Aug 6, 2015 at 8:42 AM, Adrià Vilà av...@datknosys.com wrote: Hello, HBase RegionServers fail once in a while:

Re: out of memory - unable to create new native thread

2015-07-09 Thread Vladimir Rodionov
There is a good article on the issue: http://javaeesupportpatterns.blogspot.com/2012/09/outofmemoryerror-unable-to-create-new.html -Vlad On Thu, Jul 9, 2015 at 9:38 AM, Ankit Singhal ankitsingha...@gmail.com wrote: Hi Ralph, Try increasing the ulimit for number of open files(ulimit -n) and
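The linked article's core point is that each Java thread consumes native (off-heap) stack memory, so thread capacity is bounded by address space and the OS process limit (`ulimit -u`), not the heap. A back-of-the-envelope model, with all numbers being illustrative assumptions:

```python
GB = 1024 ** 3
MB = 1024 ** 2

# Illustrative assumptions only.
native_memory_for_threads = 2 * GB   # address space left after heap, metaspace, etc.
thread_stack_size = 1 * MB           # JVM default -Xss on many 64-bit platforms

# Rough ceiling on threads before the JVM fails with
# "unable to create new native thread" (the OS ulimit may cap it earlier).
max_threads = native_memory_for_threads // thread_stack_size
print(max_threads)  # 2048 under these assumptions
```

This is why the two standard remedies are raising the ulimit (as Ankit suggests) and lowering -Xss or the heap to leave more native room.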

Re: HBase + Phoenix for CDR

2015-07-07 Thread Vladimir Rodionov
found any examples of setting TTL in Phoenix. Another question I have is regarding IMMUTABLE_ROWS=true. It’s suggested to use this for append-only table with no updates. What about deletes? Can I use IMMUTABLE_ROWS=true if I delete records from table? On 06 Jul 2015, at 20:32, Vladimir

Re: Data Model Suggestion

2015-06-23 Thread Vladimir Rodionov
From a performance point of view, fewer columns are better when doing a scan. The more columns you have in a filter, the more comparisons HBase must perform to decide whether it needs to skip or include a particular cell in the result set. -Vlad On Tue, Jun 23, 2015 at 10:01 AM, James Taylor
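The scaling argument above can be shown with a toy cost model. This is not HBase's actual filter code, just the worst-case "every cell compared against every filter column" bound the reply alludes to:

```python
def filter_comparisons(rows, columns_per_row, filter_columns):
    """Toy worst-case model: each scanned cell is compared against
    every column named in the filter before it can be skipped/included."""
    cells = rows * columns_per_row
    return cells * filter_columns

narrow = filter_comparisons(1_000_000, 5, 2)    # compact schema
wide = filter_comparisons(1_000_000, 50, 20)    # wide schema, wide filter
print(wide / narrow)  # the wide case does 100x the comparison work
```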

Re: Schema and indexes for efficient time range queries

2015-06-08 Thread Vladimir Rodionov
for your input, just some follow-up questions: 1) When you say try to fit it in a long, you mean UNSIGNED_LONG from https://phoenix.apache.org/language/datatypes.html, right? 2) Would a string format also be efficient? Like MMDDHHmm, right? Thanks a lot! On 8 June 2015 at 16:48, Vladimir Rodionov
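Both encodings discussed above keep timestamps in scan order in the row key; the trade-off is key width. A sketch comparing an 8-byte big-endian long with a fixed-width digit string (the concrete formats here are assumptions, not taken from the quoted mail):

```python
import struct
from datetime import datetime, timezone

ts1 = datetime(2015, 6, 8, 16, 48, tzinfo=timezone.utc)
ts2 = datetime(2015, 6, 8, 16, 49, tzinfo=timezone.utc)

def as_long_key(dt):
    # 8-byte big-endian epoch millis: compact, and for positive
    # timestamps byte order matches numeric order.
    return struct.pack(">q", int(dt.timestamp() * 1000))

def as_string_key(dt):
    # Fixed-width digit string (yyyyMMddHHmm): human-readable but wider.
    return dt.strftime("%Y%m%d%H%M").encode()

# Both preserve byte-wise ordering...
assert as_long_key(ts1) < as_long_key(ts2)
assert as_string_key(ts1) < as_string_key(ts2)
# ...but the long form is shorter (8 vs 12 bytes here).
print(len(as_long_key(ts1)), len(as_string_key(ts1)))
```

Shorter keys matter because the row key is repeated in every cell stored, so the long encoding saves space on every row.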

Re: Joins Benchmark

2015-06-03 Thread Vladimir Rodionov
Siva, If your use case requires Phoenix large-table joins, you had better redesign the system: keep real-time data in HBase/Phoenix and have a separate analytical system (Hadoop/Hive) with periodic batch updates from HBase. It's data duplication, but using ORCFile/Parquet with compression ON will

Re: Socket timeout while counting number of rows of a table

2015-04-09 Thread Vladimir Rodionov
1) Update hbase.rpc.timeout : 120 in client side hbase-site.xml — Bad idea. 20 min of timeout? Check the RS log files for unusual GC activity (always run HBase with GC stats on). That is probably what is going on there. On Thu, Apr 9, 2015 at 11:27 AM, Samarth Jain samarth.j...@gmail.com
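For context on the "20 min" remark: hbase.rpc.timeout is specified in milliseconds, and the value quoted in the archive is truncated. Assuming the reply's figure, the implied setting works out as:

```python
# hbase.rpc.timeout is given in milliseconds.
# 1_200_000 is the value implied by the reply's "20 min of timeout?";
# the exact number in the original mail is truncated in the archive.
rpc_timeout_ms = 1_200_000
print(rpc_timeout_ms / 60_000, "minutes")  # 20.0 minutes
```

The reply's point stands regardless of the exact figure: a timeout that long usually papers over GC pauses rather than fixing them.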

Re: HBase Cluster Down: No jar path specified for org.apache.hadoop.hbase.regionserver.LocalIndexSplitter

2015-03-05 Thread Vladimir Rodionov
Try the following: update the hbase-site.xml config, set hbase.coprocessor.enabled=false or hbase.coprocessor.user.enabled=false, sync the config across the cluster, restart the cluster, then update your table's settings in the hbase shell. -Vlad On Thu, Mar 5, 2015 at 3:32 PM, anil gupta
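In hbase-site.xml the recovery settings suggested above would look roughly like this. A sketch only; remember to revert once the dangling LocalIndexSplitter reference is removed from the table descriptor:

```xml
<!-- hbase-site.xml: temporarily disable coprocessor loading so the
     cluster can start despite a broken coprocessor reference. -->
<property>
  <name>hbase.coprocessor.enabled</name>
  <value>false</value>
</property>
<!-- Narrower alternative: only disable table-level (user) coprocessors -->
<property>
  <name>hbase.coprocessor.user.enabled</name>
  <value>false</value>
</property>
```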

Re: Which [open-souce] SQL engine atop Hadoop?

2015-01-30 Thread Vladimir Rodionov
Or SpliceDB (not open-source though), but it provides full TX and ANSI SQL-99 support and can run TPC-C/TPC-H in full. Disclaimer: I work for Splice Machine. -Vlad On Fri, Jan 30, 2015 at 8:25 AM, Vladimir Rodionov vladrodio...@gmail.com wrote: I think Phoenix the only option you have. All other

Re: RegionTooBusyException

2014-11-07 Thread Vladimir Rodionov
Thanks. It is for queries only; I do not see how this can help during data loading and index creation. -Vladimir Rodionov On Fri, Nov 7, 2014 at 10:39 AM, James Taylor jamestay...@apache.org wrote: http://phoenix.apache.org/update_statistics.html On Fri, Nov 7, 2014 at 10:36 AM, Vladimir

Re: RegionTooBusyException

2014-11-06 Thread Vladimir Rodionov
first, then limit the # of map tasks (if the former does not help), then play with the compaction config values (above). -Vladimir Rodionov On Thu, Nov 6, 2014 at 12:31 PM, Perko, Ralph J ralph.pe...@pnnl.gov wrote: Hi, I am using a combination of Pig, Phoenix and HBase to load data on a test cluster

Re: RegionTooBusyException

2014-11-06 Thread Vladimir Rodionov
You may want to try a different RegionSplitPolicy (ConstantSizeRegionSplitPolicy); the default one (IncreasingToUpperBoundRegionSplitPolicy) does not make sense when the table is pre-split in advance. -Vladimir Rodionov On Thu, Nov 6, 2014 at 1:01 PM, Vladimir Rodionov vladrodio...@gmail.com wrote: Too
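The advice can be illustrated with the split-size rule that IncreasingToUpperBoundRegionSplitPolicy uses in recent HBase versions (roughly min(max file size, 2 × flush size × n³), where n is the number of the table's regions on that regionserver). The formula and defaults below are my reading of the policy, not taken from the thread:

```python
# Sketch of IncreasingToUpperBoundRegionSplitPolicy's split threshold.
MB, GB = 1024**2, 1024**3

flush_size = 128 * MB      # hbase.hregion.memstore.flush.size (assumed default)
max_file_size = 10 * GB    # hbase.hregion.max.filesize (assumed default)

def split_size(region_count):
    if region_count == 0:
        return max_file_size
    return min(max_file_size, 2 * flush_size * region_count ** 3)

for n in (1, 2, 3, 4):
    print(n, split_size(n) // MB, "MB")
# With one region the split threshold is only 256 MB, so a freshly
# pre-split table can keep splitting far below max.filesize --
# ConstantSizeRegionSplitPolicy always uses max.filesize instead.
```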