Re: ABORTING region server and following HBase cluster "crash"

2018-11-02 Thread Neelesh
By no means am I judging Phoenix based on this. This is simply a design
trade-off (ScyllaDB goes the same route and builds global indexes). I
appreciate all the effort that has gone into Phoenix, and it was indeed a
lifesaver. But the technical point remains that single-node failures have
the potential to cascade to the entire cluster. That's the nature of global
indexes, not specific to Phoenix.

I apologize if my response came off as dismissing Phoenix altogether. FWIW,
I'm a big advocate of Phoenix internally at my org, albeit for a newer
version.


On Fri, Nov 2, 2018, 4:09 PM Josh Elser  wrote:

> I would strongly disagree with the assertion that this is some
> unavoidable problem. Yes, an inverted index is a data structure which,
> by design, creates a hotspot (phrased another way, this is "data
> locality").
>
> Lots of extremely smart individuals have spent a significant amount of
> time and effort in stabilizing secondary indexes in the past 1-2 years,
> not to mention others spending time on a local index implementation.
> Judging Phoenix in its entirety based off of an arbitrarily old version
> of Phoenix is disingenuous.
>
> On 11/2/18 2:00 PM, Neelesh wrote:
> > I think this is an unavoidable problem in some sense, if global indexes
> > are used. Essentially, global indexes create a graph of dependent region
> > servers due to index RPC calls from one RS to another. Any single
> > failure is bound to affect the entire graph, which under reasonable load
> > becomes the entire HBase cluster. We had to drop global indexes just to
> > keep the cluster running for more than a few days.
> >
> > I think Cassandra has local secondary indexes precisely because of this
> > issue. Last I checked, there were significant pending improvements
> > required for Phoenix local indexes, especially around read paths (not
> > utilizing primary key prefixes in secondary index reads where possible,
> > for example).
> >
> >
> > On Thu, Sep 13, 2018, 8:12 PM Jonathan Leech <jonat...@gmail.com> wrote:
> >
> > This seems similar to a failure scenario I’ve seen a couple of times. I
> > believe after multiple restarts you got lucky and tables were
> > brought up by HBase in the correct order.
> >
> > What happens is some kind of semi-catastrophic failure where one or
> > more region servers go down with edits that weren’t flushed and are
> > only in the WAL. These edits belong to regions whose tables have
> > secondary indexes. HBase wants to replay the WAL before bringing up
> > the region server. Phoenix wants to talk to the index region during
> > this, but can’t. It fails enough times, then stops.
> >
> > The more region servers / tables / indexes affected, the more likely
> > that a full restart will get stuck in a classic deadlock. A good
> > old-fashioned data center outage is a great way to get started with
> > this kind of problem. You might make some progress and get stuck
> > again, or restart number N might get those index regions initialized
> > before the main table.
> >
> > The surefire way to recover a cluster in this condition is to
> > strategically disable all the tables that are failing to come up.
> > You can do this from the HBase shell as long as the master is
> > running. If I remember right, it’s a pain since the disable command
> > will hang. You might need to disable a table, kill the shell,
> > disable the next table, etc. Then restart. You’ll eventually have a
> > cluster with all the region servers finally started, and a bunch of
> > disabled regions. If you disabled index tables, enable one, wait for
> > it to become available (its WAL edits will be replayed), then
> > enable the associated main table and wait for it to come online. If
> > HBase did its job without error, and your failure didn’t include
> > losing 4 disks at once, order will be restored. Lather, rinse,
> > repeat until everything is enabled and online.
> >
> > A big enough failure sprinkled with a little bit of bad luck
> > and what seems to be a Phoenix flaw == deadlock trying to get HBase
> > to start up. Fix by forcing the order that HBase brings regions
> > online. Finally, never go full restart.
> >
> >  > On Sep 10, 2018, at 7:30 PM, Batyrshin Alexander
> > <0x62...@gmail.com> wrote:
> >  >
> >  > After the update, the Master web interface shows that every region
> >  > server is now on 1.4.7 and there are no RITs.
> >

Re: ABORTING region server and following HBase cluster "crash"

2018-11-02 Thread Neelesh
I think this is an unavoidable problem in some sense, if global indexes are
used. Essentially, global indexes create a graph of dependent region
servers due to index RPC calls from one RS to another. Any single failure
is bound to affect the entire graph, which under reasonable load becomes
the entire HBase cluster. We had to drop global indexes just to keep the
cluster running for more than a few days.

I think Cassandra has local secondary indexes precisely because of this
issue. Last I checked, there were significant pending improvements required
for Phoenix local indexes, especially around read paths (not utilizing
primary key prefixes in secondary index reads where possible, for example).
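
As a point of reference for the global-vs-local trade-off discussed here, a
minimal Phoenix SQL sketch (the table and column names are hypothetical):

-- Global index: maintained in its own physical table, so index regions can
-- land on different region servers than the data regions, which is what
-- creates the cross-RS RPC dependency described above.
CREATE INDEX IDX_ORDERS_CUSTOMER ON ORDERS (CUSTOMER_ID) INCLUDE (ORDER_TOTAL);

-- Local index (4.8.0+): stored in shadow column families of the data table
-- itself, so index writes stay on the region server taking the data write.
CREATE LOCAL INDEX LIDX_ORDERS_CUSTOMER ON ORDERS (CUSTOMER_ID);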


On Thu, Sep 13, 2018, 8:12 PM Jonathan Leech  wrote:

> This seems similar to a failure scenario I’ve seen a couple of times. I
> believe after multiple restarts you got lucky and tables were brought up by
> HBase in the correct order.
>
> What happens is some kind of semi-catastrophic failure where one or more
> region servers go down with edits that weren’t flushed and are only in the
> WAL. These edits belong to regions whose tables have secondary indexes.
> HBase wants to replay the WAL before bringing up the region server. Phoenix
> wants to talk to the index region during this, but can’t. It fails enough
> times, then stops.
>
> The more region servers / tables / indexes affected, the more likely that
> a full restart will get stuck in a classic deadlock. A good old-fashioned
> data center outage is a great way to get started with this kind of problem.
> You might make some progress and get stuck again, or restart number N might
> get those index regions initialized before the main table.
>
> The surefire way to recover a cluster in this condition is to
> strategically disable all the tables that are failing to come up. You can
> do this from the HBase shell as long as the master is running. If I
> remember right, it’s a pain since the disable command will hang. You might
> need to disable a table, kill the shell, disable the next table, etc. Then
> restart. You’ll eventually have a cluster with all the region servers
> finally started, and a bunch of disabled regions. If you disabled index
> tables, enable one, wait for it to become available (its WAL edits will
> be replayed), then enable the associated main table and wait for it to come
> online. If HBase did its job without error, and your failure didn’t
> include losing 4 disks at once, order will be restored. Lather, rinse,
> repeat until everything is enabled and online.
>
> A big enough failure sprinkled with a little bit of bad luck and
> what seems to be a Phoenix flaw == deadlock trying to get HBase to start
> up. Fix by forcing the order that HBase brings regions online. Finally,
> never go full restart.
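
Below is a minimal HBase shell sketch of the disable/enable recovery
sequence described above; the table names are hypothetical, and the right
order depends on which tables are actually stuck:

# Run from the HBase shell while the master is up. Each disable may hang;
# if it does, kill the shell and continue with the next table.
disable 'MY_TABLE_IDX'
disable 'MY_TABLE'

# Restart the region servers, then bring tables back index-first.
enable 'MY_TABLE_IDX'     # wait for its regions (and WAL replay) to come online
enable 'MY_TABLE'
status 'detailed'         # check region assignment progress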
>
> > On Sep 10, 2018, at 7:30 PM, Batyrshin Alexander <0x62...@gmail.com>
> wrote:
> >
> > After the update, the Master web interface shows that every region server is
> > now on 1.4.7 and there are no RITs.
> >
> > The cluster recovered only after we restarted all region servers 4 times...
> >
> >> On 11 Sep 2018, at 04:08, Josh Elser  wrote:
> >>
> >> Did you update the HBase jars on all RegionServers?
> >>
> >> Make sure that you have all of the Regions assigned (no RITs). There
> could be a pretty simple explanation as to why the index can't be written
> to.
> >>
> >>> On 9/9/18 3:46 PM, Batyrshin Alexander wrote:
> >>> Correct me if I'm wrong, but it looks like if you have region servers A
> >>> and B that each host both index and primary-table regions, the following
> >>> situation is possible:
> >>> A and B are under write load on a table with indexes
> >>> A crashes
> >>> B fails on an index update because A is not operating, so B starts
> >>> aborting
> >>> After restart, A tries to rebuild the index from the WAL, but B is
> >>> aborting at this time, so A starts aborting too
> >>> From this moment nothing happens (0 requests to region servers), and A
> >>> and B are not responsive in the Master status web interface
>  On 9 Sep 2018, at 04:38, Batyrshin Alexander <0x62...@gmail.com> wrote:
> 
>  After the update we still can't recover the HBase cluster. Our region
>  servers keep ABORTING over and over:
> 
>  prod003:
>  Sep 09 02:51:27 prod003 hbase[1440]: 2018-09-09 02:51:27,395 FATAL
> [RpcServer.default.FPBQ.Fifo.handler=92,queue=2,port=60020]
> regionserver.HRegionServer: ABORTING region server
> prod003,60020,1536446665703: Could not update the index table, killing
> server region because couldn't write to an index table
>  Sep 09 02:51:27 prod003 hbase[1440]: 2018-09-09 02:51:27,395 FATAL
> [RpcServer.default.FPBQ.Fifo.handler=77,queue=7,port=60020]
> regionserver.HRegionServer: ABORTING region server
> prod003,60020,1536446665703: Could not update the index table, killing
> server region because couldn't write to an index table
>  Sep 09 02:52:19 prod003 hbase[1440]: 2018-09-09 02:52:19,224 FATAL
> [RpcServer.default.FPBQ.Fifo.handler=82,queue=2,port=60020]
> 

Re: Phoenix Performances & Uses Cases

2018-11-02 Thread Neelesh
Another observation with Phoenix global indexes - at very large volumes of
writes, a single region server failure cascades to the entire cluster very
quickly.

On Sat, Oct 27, 2018, 4:50 AM Nicolas Paris  wrote:

> Hi
>
> I am benchmarking Phoenix to better understand its strengths and
> weaknesses. My baseline is to compare against PostgreSQL for OLTP workloads
> and Hive LLAP for OLAP workloads. I am testing on a 10-node cluster with
> Hive (2.1) and Phoenix (4.8), 220 GB RAM / 32 CPUs, versus a PostgreSQL
> (9.6) instance with 128 GB RAM / 32 CPUs.
>
> Right now, my opinion is:
> - when getting a subset from a large table, Phoenix performs the best
> - when getting a subset from multiple large tables, Postgres performs
>   the best
> - when getting a subset from a large table joined with one or more small
>   tables, Phoenix performs the best
> - when ingesting high-frequency data, Phoenix performs the best
> - for GROUP BY queries, Hive > PostgreSQL > Phoenix
> - for windowing, transforming, and grouping, Hive performs the best,
>   Phoenix the worst
>
> Finally, my conclusion is that Phoenix is not intended at all for analytics
> queries such as grouping, windowing, and joining large tables. It suits very
> specific use cases well, like maintaining a very large table with possibly
> some small tables to join with (such as time-series data, or binary storage
> data with HBase MOB enabled).
>
> Am I missing something?
>
> Thanks,
>
> --
> nicolas
>


Re: How do local indexes work?

2017-06-29 Thread Neelesh
Thanks for the slides, Rajesh Babu.

Does this mean any read path will have to scan all regions of a table? Is
there an optimization available if the primary key and the index share a
common prefix, thus reducing the number of regions to look at?

Thanks again!
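
One way to see how a local-index read is actually served is to inspect the
query plan; a minimal sketch, assuming a hypothetical ORDERS table with a
local index on CUSTOMER_ID:

-- The plan reported by EXPLAIN shows whether the local index is used and
-- what scan it produces; the question above is whether a shared primary-key
-- prefix would let that scan be narrowed to fewer regions.
EXPLAIN SELECT ORDER_TOTAL FROM ORDERS WHERE CUSTOMER_ID = 'c-42';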


On Jun 29, 2017 7:24 PM, "rajeshb...@apache.org" <chrajeshbab...@gmail.com>
wrote:

Slides 9 and 10 give details on how the read path works.

https://www.slideshare.net/rajeshbabuchintaguntla/local-secondary-indexes-in-apache-phoenix

Let us know if you need more information.

Thanks,
Rajeshbabu.

On Fri, Jun 30, 2017 at 4:20 AM, Neelesh <neele...@gmail.com> wrote:

> Hi,
> The documentation says - "From 4.8.0 onwards we are storing all local
> index data in the separate shadow column families in the same data table".
>
> It is not quite clear to me how the read path works with local indexes. Is
> there any document that has some details on how it works? PHOENIX-1734 has
> some details (shadow CFs), but not enough.
>
> Any pointers are appreciated!
>
> Thanks
>
>


How do local indexes work?

2017-06-29 Thread Neelesh
Hi,
   The documentation says  - "From 4.8.0 onwards we are storing all local
index data in the separate shadow column families in the same data table".

It is not quite clear to me how the read path works with local indexes. Is
there any document that has some details on how it works? PHOENIX-1734 has
some details (shadow CFs), but not enough.

Any pointers are appreciated!

Thanks


Re: Global Indexes and impact on availability

2016-12-05 Thread Neelesh
Local indexes would indeed solve this problem, at the cost of some penalty
at read time. Unfortunately, our vendor distribution (Hortonworks) still
does not have all the bug fixes required for local indexes to work in a
production setting. They consider local indexes to still be in beta and are
explicit about not using them yet.

I was interested in seeing whether anyone in the community has experienced
similar issues around global indexes.

On Mon, Dec 5, 2016 at 2:39 PM, James Taylor <jamestay...@apache.org> wrote:

> Have you tried local indexes?
>
> On Mon, Dec 5, 2016 at 2:35 PM Neelesh <neele...@gmail.com> wrote:
>
>> Hello,
>>   When a region server is under stress (hotspotting, large replication,
>> call queue sizes hitting the limit, other processes competing with HBase,
>> etc.), we experience latency spikes for all regions hosted by that region
>> server. This is somewhat expected in the plain HBase world.
>>
>> However, with a Phoenix global index, this service deterioration seems to
>> propagate to a lot more region servers, since the affected RS hosts some
>> index regions. The actual data regions are on another RS and latencies on
>> that RS spike because it cannot complete the index update calls quickly.
>> And that second RS now causes issues on yet another one and so on.
>>
>> We've seen this happen on our cluster, and how we deal with this is by
>> "fixing" the original RS - split regions/restart/move around regions,
>> depending on what the problem is.
>>
>> Has anyone experienced this issue? It feels like antithetical behavior
>> for a distributed system: the cluster breaking down for the very reasons
>> it's supposed to protect against.
>>
>> I'd love to hear the thoughts of the Phoenix community on this.
>>
>


Global Indexes and impact on availability

2016-12-05 Thread Neelesh
Hello,
  When a region server is under stress (hotspotting, large replication,
call queue sizes hitting the limit, other processes competing with HBase,
etc.), we experience latency spikes for all regions hosted by that region
server. This is somewhat expected in the plain HBase world.

However, with a Phoenix global index, this service deterioration seems to
propagate to a lot more region servers, since the affected RS hosts some
index regions. The actual data regions are on another RS and latencies on
that RS spike because it cannot complete the index update calls quickly.
And that second RS now causes issues on yet another one and so on.

We've seen this happen on our cluster, and how we deal with this is by
"fixing" the original RS - split regions/restart/move around regions,
depending on what the problem is.

Has anyone experienced this issue? It feels like antithetical behavior for
a distributed system: the cluster breaking down for the very reasons it's
supposed to protect against.

I'd love to hear the thoughts of the Phoenix community on this.


Unable to find cached index metadata

2016-11-26 Thread Neelesh
Hi All,
  We are using Phoenix 4.4 with HBase 1.1.2 (Hortonworks distribution).
We're struggling with the following error on pretty much all our region
servers. The indexes are global, and the data table has more than 100B rows:

2016-11-26 12:15:41,250 INFO
 [RW.default.writeRpcServer.handler=40,queue=6,port=16020]
util.IndexManagementUtil: Rethrowing
org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2008 (INT10): ERROR
2008 (INT10): Unable to find cached index metadata.
 key=7015231383024113337 region=,-056946674
   ,1477336770695.07d70ebd63f737a62e24387cf0912af5. Index
update failed

I looked at https://issues.apache.org/jira/browse/PHOENIX-1718  and bumped
up the settings mentioned there to 1 hour


<property>
  <name>phoenix.coprocessor.maxServerCacheTimeToLiveMs</name>
  <value>3600000</value>
</property>
<property>
  <name>phoenix.coprocessor.maxMetaDataCacheTimeToLiveMs</name>
  <value>3600000</value>
</property>


but to no avail.

Any help is appreciated!

Thanks!


Re: Spark & Phoenix data load

2016-04-10 Thread Neelesh
Thanks Josh. I looked at the code as well and you are right. It would've
been great to decouple the core bulk-loader logic from CSV. That would
make more direct bulk-load integrations possible. Hopefully I'll get to that
one of these days.
On Apr 10, 2016 11:52 AM, "Josh Mahonin" <jmaho...@gmail.com> wrote:

Hi Neelesh,

The saveToPhoenix method uses the MapReduce PhoenixOutputFormat under the
hood, which is a wrapper over the JDBC driver. It's likely not as efficient
as the CSVBulkLoader, although there are performance improvements over a
simple JDBC client as the writes are spread across multiple Spark workers
(depending on the number of partitions in the RDD/DataFrame).

Regards,

Josh
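
For reference, a minimal spark-shell style sketch of the saveToPhoenix path
discussed above; the table, columns, and ZooKeeper quorum are hypothetical
and assume a matching Phoenix table already exists:

import org.apache.phoenix.spark._

// Assumes this runs in spark-shell, so `sc` is already available.
// Each tuple maps positionally onto the listed Phoenix columns; under the
// hood this goes through PhoenixOutputFormat, i.e. JDBC upserts, not HFiles.
val rows = sc.parallelize(Seq((1L, "first"), (2L, "second")))
rows.saveToPhoenix(
  "OUTPUT_TEST_TABLE",
  Seq("ID", "COL1"),
  zkUrl = Some("zkhost:2181")
)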

On Sun, Apr 10, 2016 at 1:21 AM, Neelesh <neele...@gmail.com> wrote:

> Hi ,
>   Does phoenix-spark's saveToPhoenix use the JDBC driver internally, or
> does it do something similar to CSVBulkLoader using HFiles?
>
> Thanks!
>
>


Spark & Phoenix data load

2016-04-09 Thread Neelesh
Hi ,
  Does phoenix-spark's saveToPhoenix use the JDBC driver internally, or
does it do something similar to CSVBulkLoader using HFiles?

Thanks!


Re: ERROR 2008 (INT10): Unable to find cached index metadata.

2016-02-17 Thread Neelesh
Also, was your change to phoenix.upsert.batch.size on the client or on the
region server or both?

On Wed, Feb 17, 2016 at 2:57 PM, Neelesh <neele...@gmail.com> wrote:

> Thanks Anil. We've upped phoenix.coprocessor.maxServerCacheTimeToLiveMs,
> but haven't tried playing with phoenix.upsert.batch.size. It's at the
> default of 1000.
>
> On Wed, Feb 17, 2016 at 12:48 PM, anil gupta <anilgupt...@gmail.com>
> wrote:
>
>> I think, this has been answered before:
>> http://search-hadoop.com/m/9UY0h2FKuo8RfAPN
>>
>> Please let us know if the problem still persists.
>>
>> On Wed, Feb 17, 2016 at 12:02 PM, Neelesh <neele...@gmail.com> wrote:
>>
> >>> We've been running the Phoenix 4.4 client for a while now with HBase 1.1.2.
> >>> Once in a while, while UPSERTing records (on a table with 2 global indexes),
> >>> we see the following error. I found
> >>> https://issues.apache.org/jira/browse/PHOENIX-1718 and upped both
> >>> values in that JIRA to 3600000. This still does not help and we keep
> >>> seeing this once in a while. It's also not clear whether this setting is
> >>> relevant for the client or just the server.
>>>
>>> Any help is appreciated
>>>
>>> org.apache.phoenix.execute.CommitException: java.sql.SQLException: ERROR 
>>> 2008 (INT10): Unable to find cached index metadata.  ERROR 2008 (INT10): 
>>> ERROR 2008 (INT10): Unable to find cached index metadata.  
>>> key=5115312427460709976 region=TEST_TABLE,111-222-950835849 
>>>  ,1455513914764.48b2157bcdac165898983437c1801ea7. Index update 
>>> failed
>>> at 
>>> org.apache.phoenix.execute.MutationState.commit(MutationState.java:444) 
>>> ~[phoenix-client-4.4.0-HBase-1.1.jar:4.4.0-HBase-1.1]
>>> at 
>>> org.apache.phoenix.jdbc.PhoenixConnection$3.call(PhoenixConnection.java:459)
>>>  ~[phoenix-client-4.4.0-HBase-1.1.jar:4.4.0-HBase-1.1]
>>> at 
>>> org.apache.phoenix.jdbc.PhoenixConnection$3.call(PhoenixConnection.java:456)
>>>  ~[phoenix-client-4.4.0-HBase-1.1.jar:4.4.0-HBase-1.1]
>>> at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) 
>>> ~[phoenix-client-4.4.0-HBase-1.1.jar:4.4.0-HBase-1.1]
>>> at 
>>> org.apache.phoenix.jdbc.PhoenixConnection.commit(PhoenixConnection.java:456)
>>>  ~[phoenix-client-4.4.0-HBase-1.1.jar:4.4.0-HBase-1.1]
>>>
>>>
>>
>>
>> --
>> Thanks & Regards,
>> Anil Gupta
>>
>
>


Re: ERROR 2008 (INT10): Unable to find cached index metadata.

2016-02-17 Thread Neelesh
Thanks Anil. We've upped phoenix.coprocessor.maxServerCacheTimeToLiveMs,
but haven't tried playing with phoenix.upsert.batch.size. It's at the
default of 1000.

On Wed, Feb 17, 2016 at 12:48 PM, anil gupta <anilgupt...@gmail.com> wrote:

> I think, this has been answered before:
> http://search-hadoop.com/m/9UY0h2FKuo8RfAPN
>
> Please let us know if the problem still persists.
>
> On Wed, Feb 17, 2016 at 12:02 PM, Neelesh <neele...@gmail.com> wrote:
>
> >> We've been running the Phoenix 4.4 client for a while now with HBase 1.1.2.
> >> Once in a while, while UPSERTing records (on a table with 2 global indexes),
> >> we see the following error. I found
> >> https://issues.apache.org/jira/browse/PHOENIX-1718 and upped both values
> >> in that JIRA to 3600000. This still does not help and we keep seeing
> >> this once in a while. It's also not clear whether this setting is relevant
> >> for the client or just the server.
>>
>> Any help is appreciated
>>
>> org.apache.phoenix.execute.CommitException: java.sql.SQLException: ERROR 
>> 2008 (INT10): Unable to find cached index metadata.  ERROR 2008 (INT10): 
>> ERROR 2008 (INT10): Unable to find cached index metadata.  
>> key=5115312427460709976 region=TEST_TABLE,111-222-950835849  
>> ,1455513914764.48b2157bcdac165898983437c1801ea7. Index update 
>> failed
>> at 
>> org.apache.phoenix.execute.MutationState.commit(MutationState.java:444) 
>> ~[phoenix-client-4.4.0-HBase-1.1.jar:4.4.0-HBase-1.1]
>> at 
>> org.apache.phoenix.jdbc.PhoenixConnection$3.call(PhoenixConnection.java:459) 
>> ~[phoenix-client-4.4.0-HBase-1.1.jar:4.4.0-HBase-1.1]
>> at 
>> org.apache.phoenix.jdbc.PhoenixConnection$3.call(PhoenixConnection.java:456) 
>> ~[phoenix-client-4.4.0-HBase-1.1.jar:4.4.0-HBase-1.1]
>> at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) 
>> ~[phoenix-client-4.4.0-HBase-1.1.jar:4.4.0-HBase-1.1]
>> at 
>> org.apache.phoenix.jdbc.PhoenixConnection.commit(PhoenixConnection.java:456) 
>> ~[phoenix-client-4.4.0-HBase-1.1.jar:4.4.0-HBase-1.1]
>>
>>
>
>
> --
> Thanks & Regards,
> Anil Gupta
>