[jira] [Commented] (PHOENIX-1395) ResultSpooler spill files are left behind in /tmp folder
[ https://issues.apache.org/jira/browse/PHOENIX-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16029801#comment-16029801 ] Enis Soztutar commented on PHOENIX-1395:

bq. on phoenix 4.10; seeing around 30G of "00c53f2c-9c3c-4f61-98ef-30977edbaf825826236490782363992.tmp" and "ResultSpooler*.bin" files!

Are there active queries going on, or are these leftovers similar to the ones in the jira description? You can check whether these files are in use (by the region server or the Phoenix client) by doing an lsof.

bq. Is it safe to delete them?

It should be safe if no application has these files open for reading or writing. However, if that is the case, I would suggest opening another jira to track the bug, since this one is pretty old.

> ResultSpooler spill files are left behind in /tmp folder
>
> Key: PHOENIX-1395
> URL: https://issues.apache.org/jira/browse/PHOENIX-1395
> Project: Phoenix
> Issue Type: Bug
> Reporter: Jeffrey Zhong
> Assignee: Alicia Ying Shu
> Fix For: 4.3.0, 3.3.0
>
> Attachments: PHOENIX-1395.patch, PHOENIX-1395.v1.patch
>
> Recently we found that some ResultSpooler*.bin files were left in the tmp folder. I think those are due to some client code that doesn't call close on the returned ResultSet (which internally will invoke the underlying OnDiskResultIterator.close()), or client code getting killed during result iterating.

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (PHOENIX-3789) Execute cross region index maintenance calls in postBatchMutateIndispensably
[ https://issues.apache.org/jira/browse/PHOENIX-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008820#comment-16008820 ] Enis Soztutar commented on PHOENIX-3789:

Thanks James, good to know. Agreed that this is way more scalable, but I was more concerned about the fact that the design for the secondary indexes was such that the index update would be visible before the data table update. If it is not a problem in terms of "correctness", then this approach is better.

> Execute cross region index maintenance calls in postBatchMutateIndispensably
>
> Key: PHOENIX-3789
> URL: https://issues.apache.org/jira/browse/PHOENIX-3789
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
> Assignee: James Taylor
> Fix For: 4.11.0
>
> Attachments: PHOENIX-3789_addendum1.patch, PHOENIX-3789_addendum2.patch, PHOENIX-3789.patch, PHOENIX-3789_v2.patch
>
> Making cross region server calls while the row is locked can lead to a greater chance of resource starvation. We can use the postBatchMutateIndispensably hook instead of the postBatchMutate call for our processing.
[jira] [Commented] (PHOENIX-3789) Execute cross region index maintenance calls in postBatchMutateIndispensably
[ https://issues.apache.org/jira/browse/PHOENIX-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008548#comment-16008548 ] Enis Soztutar commented on PHOENIX-3789:

bq. This earlier behavior was not scalable.

Fully agreed. We have seen this in a few busy clusters, where the whole cluster grinds to a halt when one particular region server is slow or unresponsive. We were discussing this patch internally, because in theory it helps a lot in scaling the index updates. My only concern is that with this patch, we are changing the visibility semantics of the data table update versus the index table update. In the original design, the data table update was not visible until the index update was visible. This patch changes the logic completely, so that the data table update is visible before the index RPCs are even scheduled. I do not know enough of the details to see whether this will be semantically correct under the consistency guarantees of the current design. Are we changing those guarantees safely?
[jira] [Commented] (PHOENIX-3360) Secondary index configuration is wrong
[ https://issues.apache.org/jira/browse/PHOENIX-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15967271#comment-15967271 ] Enis Soztutar commented on PHOENIX-3360:

bq. just want to confirm that we still need the RPC scheduler configured as we document here

We do not need that config anymore. Let's update the documentation.

bq. confusing that it's in org.apache.hadoop.hbase.ipc.controller package

I think it was done that way due to some internals in RpcController being package-protected. We should remove {{hbase.rpc.controllerfactory.class}}, but keep {{hbase.region.server.rpc.scheduler.factory.class}}.

> Secondary index configuration is wrong
>
> Key: PHOENIX-3360
> URL: https://issues.apache.org/jira/browse/PHOENIX-3360
> Project: Phoenix
> Issue Type: Bug
> Reporter: Enis Soztutar
> Assignee: William Yang
> Priority: Critical
> Fix For: 4.10.0
>
> Attachments: ConfCP.java, PHOENIX-3360.patch, PHOENIX-3360-v2.PATCH, PHOENIX-3360-v3.PATCH, PHOENIX-3360-v4.PATCH
>
> IndexRpcScheduler allocates some handler threads and uses a higher priority for RPCs. The corresponding IndexRpcController is not used by default as it is, but is used through the ServerRpcControllerFactory that we configure from Ambari by default, which sets the priority of the outgoing RPCs to either the metadata priority or the index priority.
> However, after reading the code of IndexRpcController / ServerRpcController, it seems that the IndexRPCController DOES NOT look at whether the outgoing RPC is for an index table or not. It just sets ALL rpc priorities to be the index priority. The intention seems to have been that ONLY on servers do we configure ServerRpcControllerFactory, and with clients we NEVER configure ServerRpcControllerFactory, but instead use ClientRpcControllerFactory. We configure ServerRpcControllerFactory from Ambari, which in effect makes it so that ALL rpcs from Phoenix are handled only by the index handlers by default. It means all deadlock cases are still there.
> The documentation in https://phoenix.apache.org/secondary_indexing.html is also wrong in this sense. It does not talk about server side / client side. Plus, this way of configuring different values is not how HBase configuration is deployed. We cannot have the configuration show the ServerRpcControllerFactory even only for server nodes, because the clients running on those nodes will also see the wrong values.
[jira] [Commented] (PHOENIX-3062) JMXCacheBuster restarting the metrics system causes PhoenixTracingEndToEndIT to hang
[ https://issues.apache.org/jira/browse/PHOENIX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15941374#comment-15941374 ] Enis Soztutar commented on PHOENIX-3062:

Checked the patch; it seems like the right approach. You should also delete PhoenixMetricsSink since it is not needed anymore. Also rename TraceMetricSource, because it is not a MetricSource anymore. There are a couple of debug statements lying around.

> JMXCacheBuster restarting the metrics system causes PhoenixTracingEndToEndIT to hang
>
> Key: PHOENIX-3062
> URL: https://issues.apache.org/jira/browse/PHOENIX-3062
> Project: Phoenix
> Issue Type: Bug
> Reporter: Enis Soztutar
> Assignee: Karan Mehta
> Fix For: 4.11.0
>
> Attachments: phoenix-3062_v1.patch, PHOENIX-3062_v2.patch
>
> With some recent fixes in the hbase metrics system, we are now effectively restarting the metrics system (in HBase-1.3.0, probably not affecting 1.2.0). Since we use a custom sink in the PhoenixTracingEndToEndIT, restarting the metrics system loses the registered sink, thus causing a hang. We need a fix in HBase and Phoenix so that we will not restart the metrics during tests. Thanks to [~sergey.soldatov] for analyzing the initial root cause of the hang. See HBASE-14166 and others.
[jira] [Resolved] (PHOENIX-3707) PhoenixTracingEndToEndIT is hanging on master branch
[ https://issues.apache.org/jira/browse/PHOENIX-3707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar resolved PHOENIX-3707.

Resolution: Duplicate

Re-resolving as duplicate.

> PhoenixTracingEndToEndIT is hanging on master branch
>
> Key: PHOENIX-3707
> URL: https://issues.apache.org/jira/browse/PHOENIX-3707
> Project: Phoenix
> Issue Type: Bug
> Reporter: Samarth Jain
> Assignee: Samarth Jain
> Priority: Blocker
>
> https://builds.apache.org/job/Phoenix-master/1574/console
[jira] [Reopened] (PHOENIX-3707) PhoenixTracingEndToEndIT is hanging on master branch
[ https://issues.apache.org/jira/browse/PHOENIX-3707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar reopened PHOENIX-3707:
[jira] [Commented] (PHOENIX-3062) JMXCacheBuster restarting the metrics system causes PhoenixTracingEndToEndIT to hang
[ https://issues.apache.org/jira/browse/PHOENIX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15895292#comment-15895292 ] Enis Soztutar commented on PHOENIX-3062:

The {{TraceMetricSource}} javadoc explains some of it, but from what I remember, htrace works by sending all the traces to the configured {{SpanReceiver}}. So all of the hdfs + hbase and phoenix traces go to the same SpanReceiver. {{TraceMetricSource}} implements the SpanReceiver and forwards the spans to the metrics system. The {{PhoenixMetricsSink}} periodically runs via the metrics subsystem, gets the buffered traces via the getMetrics() call, and then issues the Phoenix writes. As long as we still implement the SpanReceiver, the metrics will be collected from all sources (hdfs, hbase, phoenix). We just need to remove the metrics dependency by forking a scheduled thread for the {{PhoenixMetricsSink}}, and also put in a limited buffered queue or something, where the traces will be dropped if we cannot keep up. Should be an easy patch.
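The bounded-queue idea above could look roughly like this. This is a minimal sketch only; the {{BoundedSpanSink}} class, the {{receiveSpan}} method, and the use of plain Strings for spans are all invented for illustration and do not reflect the actual HTrace or Phoenix APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/**
 * Illustrative sketch: a span receiver that buffers traces in a bounded
 * queue and writes them from its own daemon thread, dropping spans when
 * the writer cannot keep up instead of stalling the tracing path.
 */
public class BoundedSpanSink {
    private final BlockingQueue<String> queue;
    private long dropped = 0;

    public BoundedSpanSink(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Non-blocking offer: drop the span rather than block the caller. */
    public boolean receiveSpan(String span) {
        boolean accepted = queue.offer(span);
        if (!accepted) {
            dropped++;
        }
        return accepted;
    }

    public long droppedCount() { return dropped; }

    /** Drain whatever is buffered; in real use this runs on the daemon thread. */
    public List<String> drain() {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch);
        return batch;
    }

    /** Start the background writer as a daemon so it never blocks JVM exit. */
    public Thread startDaemon(Runnable writer) {
        Thread t = new Thread(writer, "phoenix-trace-writer");
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) {
        BoundedSpanSink sink = new BoundedSpanSink(2);
        sink.receiveSpan("span-1");
        sink.receiveSpan("span-2");
        sink.receiveSpan("span-3"); // queue full: dropped, not blocked
        System.out.println("buffered=" + sink.drain().size()
            + " dropped=" + sink.droppedCount()); // buffered=2 dropped=1
    }
}
```

The key property is that the tracing hot path only ever does a non-blocking offer, so a slow Phoenix writer can never back up the traced operations.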
[jira] [Commented] (PHOENIX-3062) JMXCacheBuster restarting the metrics system causes PhoenixTracingEndToEndIT to hang
[ https://issues.apache.org/jira/browse/PHOENIX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893536#comment-15893536 ] Enis Soztutar commented on PHOENIX-3062:

When the metrics system gets restarted, it picks up the new sinks from the metrics system's properties file. I am not sure whether the tracing system is normally registered in that file. If it is, this is not a production issue; otherwise, you are right. On a separate note, having the tracing system piggy-back on the metrics subsystem is not right. A simple bounded queue with a separate daemon thread would solve the tracing use case. I think we should fix that for the long term.
[jira] [Commented] (PHOENIX-3360) Secondary index configuration is wrong
[ https://issues.apache.org/jira/browse/PHOENIX-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866815#comment-15866815 ] Enis Soztutar commented on PHOENIX-3360:

bq. we can use the v1 patch with a little modification that we just set the conf returned by CoprocessorEnvironment#getConfiguration()

Indeed. The patch should have read env.getConfiguration() as opposed to env.getRegionServerServices().getConfiguration(). +1 for v4. [~rajeshbabu] do you mind committing this?
[jira] [Commented] (PHOENIX-3360) Secondary index configuration is wrong
[ https://issues.apache.org/jira/browse/PHOENIX-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15862162#comment-15862162 ] Enis Soztutar commented on PHOENIX-3360:

bq. If we set an RS level RPC config, then all replication requests will be handled by the index handlers in those RSs instead of the normal handlers.

This is a good point. However, I just checked the code. The Configuration object that is passed to the coprocessor via the coprocessor environment is the region's configuration object (HRegion.conf). That configuration is a CompoundConfiguration, which is a superimposed configuration of the region server config + HTD.getConfig() + HCD.getConfig(). CompoundConfiguration treats the added configs as immutable, and has an internal mutable config (see the code). This means that with the original patch, the rest of the region server (including replication) will not be affected. I did not verify this by testing, though, so if anybody has the time, verifying it would be great.

The problem with v3 is that {{CoprocessorHConnection}} is internal to HBase. We should not use that in Phoenix.
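To illustrate the CompoundConfiguration behavior described above: writes go to a private mutable layer while the underlying configs stay untouched. This is a simplified stand-in, not HBase's actual CompoundConfiguration class; {{OverlayConfiguration}} and its map-based API are invented for the sketch.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Simplified illustration of a compound/overlay configuration: reads fall
 * through to an immutable base, writes land in a private mutable map. So
 * mutating the region-level view never changes the region server's own
 * configuration object.
 */
public class OverlayConfiguration {
    private final Map<String, String> base;                        // e.g. region server config: read-only here
    private final Map<String, String> overlay = new HashMap<>();   // per-region mutations

    public OverlayConfiguration(Map<String, String> base) {
        this.base = base;
    }

    public void set(String key, String value) {
        overlay.put(key, value); // never touches base
    }

    public String get(String key) {
        String v = overlay.get(key);
        return v != null ? v : base.get(key);
    }

    public static void main(String[] args) {
        Map<String, String> rsConf = new HashMap<>();
        rsConf.put("hbase.rpc.controllerfactory.class", "ClientRpcControllerFactory");

        OverlayConfiguration regionConf = new OverlayConfiguration(rsConf);
        regionConf.set("hbase.rpc.controllerfactory.class", "ServerRpcControllerFactory");

        // The region-level view sees the override...
        System.out.println(regionConf.get("hbase.rpc.controllerfactory.class"));
        // ...but the region server config (used e.g. by replication) is unchanged.
        System.out.println(rsConf.get("hbase.rpc.controllerfactory.class"));
    }
}
```

If HBase's real class behaves this way, setting the controller factory from the Indexer would affect only that region's view, which is exactly the claim that needs verifying by testing.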
[jira] [Commented] (PHOENIX-3661) Make phoenix tool select file system dynamically
[ https://issues.apache.org/jira/browse/PHOENIX-3661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15861957#comment-15861957 ] Enis Soztutar commented on PHOENIX-3661:

LGTM.

> Make phoenix tool select file system dynamically
>
> Key: PHOENIX-3661
> URL: https://issues.apache.org/jira/browse/PHOENIX-3661
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.7.0, 4.8.0
> Reporter: Yishan Yang
> Attachments: phoenix-3661-1.patch
>
> The Phoenix indexing tool assumes that the root directory is on the default Hadoop FileSystem. With this patch, the phoenix index tool will get the file system dynamically, which will prevent "Wrong FileSystem" errors.
[jira] [Commented] (PHOENIX-3360) Secondary index configuration is wrong
[ https://issues.apache.org/jira/browse/PHOENIX-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15858369#comment-15858369 ] Enis Soztutar commented on PHOENIX-3360:

bq. When ever one of the phoenix table region opened in the RS we are setting the property with ServerRpcControllerFactory

It is the Indexer setting the configuration. Even if there are no tables with an index, do we still instantiate the Indexer?

bq. With William Yang's patch the short-circuit write optimization will not work because the connection created is not a coprocessor connection.

That is a good point. We need the short-circuiting logic; otherwise it is another cause for deadlocking the threads.
[jira] [Commented] (PHOENIX-3360) Secondary index configuration is wrong
[ https://issues.apache.org/jira/browse/PHOENIX-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856801#comment-15856801 ] Enis Soztutar commented on PHOENIX-3360:

Thanks [~yhxx511]. [~rajeshbabu] did you take a look at William's v2 patch?

bq. I've read PHOENIX-3271, unfortunately, this patch will break the assumption that all cross RS calls are made with higher priority, as we didn't change the coprocessor env's configuration, we just created a new hbase connection with specific configurations.

I think [~an...@apache.org] was saying that the priorities are set correctly for all RS to RS communication, but looking at the patch for this, it does not seem to be the case (the indexer changes the regionserver's configuration, though).

bq. The optimal way to make things right is to use separate handler queues for read and write. Sharing the same handler queue with index requests might not be the best idea

Here (https://issues.apache.org/jira/browse/PHOENIX-3271?focusedCommentId=15836531=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15836531), I was arguing that we should instead do a "task pool" for Phoenix, and execute the upsert-select-like requests there. Indeed, we cannot rely on the user to configure read/write pools.
[jira] [Commented] (PHOENIX-3271) Distribute UPSERT SELECT across cluster
[ https://issues.apache.org/jira/browse/PHOENIX-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836531#comment-15836531 ] Enis Soztutar commented on PHOENIX-3271:

This is a nice improvement. [~rajeshbabu] has a patch which changes the rpc scheduler to be configured programmatically from the server side, related to PHOENIX-3360. Do we need that patch in before this?

For this patch in general, depending on the handler priorities has proved to be brittle; however, this will work if we confirm that the index rpc handlers will be used in all cross-RS communication. Agreed that we have to fix the documentation, and also rename the "index" handlers.

For the long term, I would rather have another approach, where the Phoenix Rpc scheduler has a different thread pool (with low priority) to execute generic "tasks". In this case, the scan fragments would be executed from that task thread pool, but the upsert writes would go to the normal thread pool. Doing these scan-with-insert kinds of things should not be piggy-backed on the scan flow, I think.

One other thing is that these scan RPCs will take a longer time, will time out, and will be retried from the client, making the worst-case behavior a pretty bad user experience. Do we have any plans for dealing with that? On newer HBase versions the scanner has heartbeats and can return early, close to the scanner lease timeout. Does that apply to these upsert selects? Maybe we should add a safe-guard configuration in case larger clusters cannot execute the scan fragments under the rpc timeout. wdyt?

> Distribute UPSERT SELECT across cluster
>
> Key: PHOENIX-3271
> URL: https://issues.apache.org/jira/browse/PHOENIX-3271
> Project: Phoenix
> Issue Type: Improvement
> Reporter: James Taylor
> Assignee: Ankit Singhal
> Fix For: 4.10.0
>
> Attachments: PHOENIX-3271.patch, PHOENIX-3271_v1.patch, PHOENIX-3271_v2.patch, PHOENIX-3271_v3.patch, PHOENIX-3271_v4.patch, PHOENIX-3271_v5.patch
>
> Based on some informal testing we've done, it seems that creation of a local index is orders of magnitude faster than creation of global indexes (17 seconds versus 10-20 minutes - though more data is written in the global index case). Under the covers, a global index is created through the running of an UPSERT SELECT. Also, UPSERT SELECT provides an easy way of copying a table. In both of these cases, the data being upserted must all flow back to the same client, which can become a bottleneck for a large table. Instead, what can be done is to push each separate, chunked UPSERT SELECT call out to a different region server for execution there. One way we could implement this would be to have an endpoint coprocessor push the chunked UPSERT SELECT out to each region server and return the number of rows that were upserted back to the client.
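The "task pool" idea in the comment above could be sketched like this. This is an assumed design, not actual Phoenix code; the {{TaskPool}} name and API are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical sketch: long-running work such as server-side UPSERT SELECT
 * fragments runs on a separate low-priority pool, so it cannot starve the
 * regular RPC handler threads.
 */
public class TaskPool {
    private final ThreadPoolExecutor executor;

    public TaskPool(int threads, int queueSize) {
        this.executor = new ThreadPoolExecutor(
            threads, threads, 60L, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(queueSize),    // bounded queue: back-pressure, not OOM
            r -> {
                Thread t = new Thread(r, "phoenix-task");
                t.setDaemon(true);
                t.setPriority(Thread.MIN_PRIORITY); // below the normal RPC handlers
                return t;
            },
            new ThreadPoolExecutor.AbortPolicy());  // reject when saturated; caller retries
    }

    public <T> Future<T> submit(Callable<T> task) {
        return executor.submit(task);
    }

    public void shutdown() {
        executor.shutdown();
    }
}
```

The bounded queue plus rejection policy is what distinguishes this from simply reusing the scan handlers: a flood of upsert-select fragments gets pushed back to the client instead of occupying every handler thread.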
[jira] [Commented] (PHOENIX-2565) Store data for immutable tables in single KeyValue
[ https://issues.apache.org/jira/browse/PHOENIX-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800011#comment-15800011 ] Enis Soztutar commented on PHOENIX-2565:

Thanks, I was checking the code in the branch before 01ef5d. Here: https://github.com/apache/phoenix/blob/encodecolumns2/phoenix-core/src/main/java/org/apache/phoenix/schema/types/PArrayDataType.java#L1299 we are still serializing the nulls, no?

bq. For example, if column 1 is set and column 102 is set, we're storing offsets for column2 through column 101. We could instead introduce a bit set that tracks if a value is set

For doing nulls in Avro, you do a union of the type with the Null type, so all nullable fields are encoded like {{}}. So Avro has to spend 1 byte per nullable field, regardless of whether the field is there or not. PB has a different model, where each type is prefixed with the id of the field, which also means that if the field is not there, it is null. So the cost is 1 varint per field that is not-null (as opposed to per field in the schema). Obviously, which is optimal depends on whether, on average, there are a lot of null fields in the data or not.

The cost of doing a bitset for nullability would be 1 byte per 8 "declared" fields (regardless of whether there are nulls or not). If there is a single null field, we are saving 2 or 4 bytes (for the offset). So if, on average, we expect the data to have at least 1 null per 16 columns or so, it looks like a good idea to implement this.

> Store data for immutable tables in single KeyValue
>
> Key: PHOENIX-2565
> URL: https://issues.apache.org/jira/browse/PHOENIX-2565
> Project: Phoenix
> Issue Type: Improvement
> Reporter: James Taylor
> Assignee: Thomas D'Silva
> Attachments: PHOENIX-2565-v2.patch, PHOENIX-2565-wip.patch, PHOENIX-2565.patch
>
> Since an immutable table (i.e. declared with IMMUTABLE_ROWS=true) will never update a column value, it'd be more efficient to store all column values for a row in a single KeyValue. We could use the existing format we have for variable length arrays.
> For backward compatibility, we'd need to support the current mechanism. Also, you'd no longer be allowed to transition an existing table to/from being immutable. I think the best approach would be to introduce a new IMMUTABLE keyword and use it like this:
> {code}
> CREATE IMMUTABLE TABLE ...
> {code}
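The break-even arithmetic from the comment above can be checked with a few lines, assuming 2-byte offsets (the same logic applies with 4-byte offsets); {{NullEncodingCost}} is just an illustrative helper, not Phoenix code.

```java
/**
 * Worked numbers behind the break-even estimate: a nullability bitset
 * costs one bit per declared column, while each null column saves one
 * fixed-width offset entry.
 */
public class NullEncodingCost {
    /** Bitset cost: one bit per declared column, rounded up to whole bytes. */
    static int bitsetBytes(int declaredColumns) {
        return (declaredColumns + 7) / 8;
    }

    /** Bytes saved by the bitset: each null column no longer needs an offset. */
    static int offsetBytesSaved(int nullColumns, int offsetWidth) {
        return nullColumns * offsetWidth;
    }

    public static void main(String[] args) {
        // With 2-byte offsets, a single null among 16 declared columns saves
        // 2 bytes, while the bitset itself costs 16/8 = 2 bytes: that is the
        // "1 null per 16 columns" break-even point from the comment.
        System.out.println(bitsetBytes(16));        // 2
        System.out.println(offsetBytesSaved(1, 2)); // 2
    }
}
```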
[jira] [Commented] (PHOENIX-2565) Store data for immutable tables in single KeyValue
[ https://issues.apache.org/jira/browse/PHOENIX-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799857#comment-15799857 ] Enis Soztutar commented on PHOENIX-2565:

bq. the format you outlined is the format of the single key value format. It's very similar to the array format with the differences being:

Thanks. Reading the code of PArrayDataType and looking at the serialized data led us to believe otherwise. Maybe I am missing something.
[jira] [Commented] (PHOENIX-2565) Store data for immutable tables in single KeyValue
[ https://issues.apache.org/jira/browse/PHOENIX-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799459#comment-15799459 ] Enis Soztutar commented on PHOENIX-2565: From the experience of trying to use this for billions of rows and hundreds of columns (where the schema is a regular RDBMS one), there are a couple of problems that the array encoding has in terms of packing data efficiently. - Array encoding uses all three of separators, offsets / lengths, and nullability encoding. This means that there is a lot of unnecessary overhead for representing repetitive information. - Run-length-encoding-like null representation gets really expensive if you have data like {{a, , b, , c, }}. A simple bitset is easier and more efficient. Or, if you are already encoding the offsets, you do not have to re-encode nullability: if offset_i and offset_i+1 are equal, the field is null. - The offsets are 4 or 2 bytes fixed length, not using varint encoding. This makes a difference for the majority of data, where the expected number of columns is <128. I think the array encoding is this way because arrays can be part of the row key. However, for packing column values, we do not need the lexicographic-sortability guarantee, meaning that we can do a much better job than the array encoding. The way forward, I think, is to leave the array encoding as it is, but instead add a PStructDataType that implements the new scheme. This is the exact problem that Avro / PB / Thrift encodings solve already. However, the requirements are a little different for Phoenix. - First, we have to figure out how we are going to deal with schema evolution. - We need an efficient way to access individual fields within the byte array without deserializing the whole byte[] (although notice that it is already read from disk and in-memory). - Nullability support. 
Looking at this, I think something like FlatBuffers / Cap'n Proto looks more like the direction (especially with the requirement that we do not want to deserialize the whole thing). If we want to do a custom format with the given encodings, I think we can do something like this: {code} ... {code} where - {{format_id}} : single byte showing the format of the data, - {{column_n}} : column data, NO separators - {{offset_n}} : byte offset of the nth column. It can be varint, if we can cache this data. Otherwise, we can make this 1/2/4 bytes and encode that information at the tail. - {{offset_start}}: this is the offset of . The reader can find and cache how many columns there are in the encoded data by reading all of the offsets. Notice that we can only add columns to an existing table, and the schema is still in the catalog table. Columns not used anymore are always null. To read a column, you would find the offset of the column, and the length would be {{offset_n+1}} - {{offset_n}}. If a column is null, it is always encoded as 0 bytes, and {{offset_n+1}} would be equal to {{offset_n}}. > Store data for immutable tables in single KeyValue > -- > > Key: PHOENIX-2565 > URL: https://issues.apache.org/jira/browse/PHOENIX-2565 > Project: Phoenix > Issue Type: Improvement >Reporter: James Taylor >Assignee: Thomas D'Silva > Attachments: PHOENIX-2565-v2.patch, PHOENIX-2565-wip.patch, > PHOENIX-2565.patch > > > Since an immutable table (i.e. declared with IMMUTABLE_ROWS=true) will never > update a column value, it'd be more efficient to store all column values for > a row in a single KeyValue. We could use the existing format we have for > variable length arrays. > For backward compatibility, we'd need to support the current mechanism. Also, > you'd no longer be allowed to transition an existing table to/from being > immutable. I think the best approach would be to introduce a new IMMUTABLE > keyword and use it like this: > {code} > CREATE IMMUTABLE TABLE ... 
> {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
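The packed layout sketched in the comment above ({{format_id}}, concatenated column bytes, tail offsets, {{offset_start}}) can be illustrated with a small, self-contained Java example. This is a hypothetical sketch, not Phoenix's actual encoding: it assumes fixed 4-byte offsets (the comment leaves varint vs. fixed-width open) and encodes a null column as zero bytes, so that offset_n+1 == offset_n marks null, as described.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

class SingleCellEncoding {
    static final byte FORMAT_ID = 1;

    // Layout: [format_id][column bytes...][offset_0..offset_n][offset_start]
    // A null column contributes zero bytes, so its start and end offsets match.
    static byte[] encode(byte[][] columns) {
        ByteArrayOutputStream data = new ByteArrayOutputStream();
        int[] offsets = new int[columns.length + 1]; // n starts plus one end offset
        int pos = 1;                                 // first column starts after format_id
        for (int i = 0; i < columns.length; i++) {
            offsets[i] = pos;
            if (columns[i] != null) {
                data.write(columns[i], 0, columns[i].length);
                pos += columns[i].length;
            }
        }
        offsets[columns.length] = pos;
        ByteBuffer buf = ByteBuffer.allocate(1 + data.size() + 4 * offsets.length + 4);
        buf.put(FORMAT_ID);
        buf.put(data.toByteArray());
        for (int off : offsets) buf.putInt(off);
        buf.putInt(pos); // offset_start: byte index where the offset array begins
        return buf.array();
    }

    // Reads column n with two offset lookups, without deserializing the rest.
    static byte[] getColumn(byte[] encoded, int n) {
        ByteBuffer buf = ByteBuffer.wrap(encoded);
        int offsetStart = buf.getInt(encoded.length - 4);
        int start = buf.getInt(offsetStart + 4 * n);
        int end = buf.getInt(offsetStart + 4 * (n + 1));
        if (start == end) {
            return null; // zero-length span encodes a null column
        }
        byte[] col = new byte[end - start];
        buf.position(start);
        buf.get(col);
        return col;
    }
}
```

A reader can also recover the column count as (encoded.length - 4 - offsetStart) / 4 - 1, which is the "find and cache how many columns" step mentioned in the comment.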
[jira] [Commented] (PHOENIX-2935) IndexMetaData cache can expire when a delete and or query running on server
[ https://issues.apache.org/jira/browse/PHOENIX-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563474#comment-15563474 ] Enis Soztutar commented on PHOENIX-2935: bq. Problem could be a size of scan object become little large(by 2KB+ with 15-30 indexes) and RPC could be little slow which would be aggravated with large number of guidePosts or region in range for the query but not sure if this is huge side effect against failing the query on cache expiry. Agreed; however, I think it should be acceptable given that this is for UPSERT SELECT kinds of queries, where the expected runtime of the query is already high enough to offset the extra cost of serializing the index metadata. +1 for the patch. > IndexMetaData cache can expire when a delete and or query running on server > --- > > Key: PHOENIX-2935 > URL: https://issues.apache.org/jira/browse/PHOENIX-2935 > Project: Phoenix > Issue Type: Bug >Reporter: Ankit Singhal >Assignee: Ankit Singhal > Labels: index > Fix For: 4.9.0 > > Attachments: PHOENIX-2935.patch > > > IndexMetaData cache can expire when a delete or upsert query is running on > server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3360) Secondary index configuration is wrong
[ https://issues.apache.org/jira/browse/PHOENIX-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553918#comment-15553918 ] Enis Soztutar commented on PHOENIX-3360: Thanks James for taking a look. bq. In this case, we know that any RPC is due to an index needing to be updated (hence there's not check) Not necessarily. RS -> RS communications can happen for other reasons: meta updates, stats updates, ACL updates, etc. We definitely do not want to use the index priority unintentionally for a stats update or ACL update. This has been such a fragile area that getting it wrong results in horrible deadlocks (see HBASE-16773 as a recent example). bq. The intention is that ServerRpcControllerFactory is only used on the server The clients running on the same nodes as the servers usually share the same configuration. Not all deployments are like that, but a shared HBASE_CONF_DIR is probably the more common setup. This means client requests that originate from the same nodes that run a server will go to the wrong queues. bq. I would love to use HBASE-15816 as we could get rid of all this RpcControllerFactory nonsense (which we only used because we have no HBase API to set the priority). This would make it much more clear. I think we should get that into HBase in any case. The problem with relying on HBASE-15816 is that ALL requests would have to be explicitly tagged with the correct priority. If we accidentally miss a couple, we can still deadlock. The code changes, and we add new stuff, new RPCs, etc. all the time. I would rather find a future-proof solution than rely on code reviews to make sure that all RPCs are marked. In HBase, we used to have more server-side computation of the RPC priority (AnnotationReadingPriorityFunction), which we realized is quite costly because the RPC reader threads have to do the work to find the correct queue for the scheduler. 
We are pushing more and more to set the priority at the client side via RpcControllers. The problem is that the RpcController does not have any information about the request other than the table name. In the case of requests to the hbase:meta table or SYSTEM.CATALOG, it works perfectly. In the case of index tables, it fails because we cannot tell whether the table is an index table just by looking at its name. I was thinking of adding an HTD cache or something in the RpcController so that the priority can come from the table descriptor. However, this means that we have to do a master request to fetch the HTD first. I think that won't work. > Secondary index configuration is wrong > -- > > Key: PHOENIX-3360 > URL: https://issues.apache.org/jira/browse/PHOENIX-3360 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Priority: Critical > > IndexRpcScheduler allocates some handler threads and uses a higher priority > for RPCs. The corresponding IndexRpcController is not used by default as it > is, but used through ServerRpcControllerFactory that we configure from Ambari > by default which sets the priority of the outgoing RPCs to either metadata > priority, or the index priority. > However, after reading code of IndexRpcController / ServerRpcController it > seems that the IndexRPCController DOES NOT look at whether the outgoing RPC > is for an Index table or not. It just sets ALL rpc priorities to be the index > priority. The intention seems to be the case that ONLY on servers, we > configure ServerRpcControllerFactory, and with clients we NEVER configure > ServerRpcControllerFactory, but instead use ClientRpcControllerFactory. We > configure ServerRpcControllerFactory from Ambari, which in effect makes it so > that ALL rpcs from Phoenix are only handled by the index handlers by default. > It means all deadlock cases are still there. > The documentation in https://phoenix.apache.org/secondary_indexing.html is > also wrong in this sense. 
It does not talk about server side / client side. > Plus this way of configuring different values is not how HBase configuration > is deployed. We cannot have the configuration show the > ServerRpcControllerFactory even only for server nodes, because the clients > running on those nodes will also see the wrong values. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
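The limitation described in the comment above (the controller sees only the table name) can be shown with a tiny, purely illustrative Java sketch; the names and QOS values here are made up for the example, not HBase constants.

```java
// Illustrative only: a client-side controller can recognize system tables by
// name, but an index table's name carries no marker, so it cannot be singled
// out for a dedicated index priority without consulting the table descriptor.
class PrioritySketch {
    static final int NORMAL_QOS = 0;
    static final int METADATA_QOS = 200;

    static int priorityFor(String tableName) {
        if (tableName.startsWith("hbase:") || tableName.startsWith("SYSTEM.")) {
            return METADATA_QOS; // hbase:meta, SYSTEM.CATALOG, etc. work fine
        }
        // "MY_TABLE_IDX" and "MY_TABLE" look the same here; the name alone
        // does not tell us which one is an index table.
        return NORMAL_QOS;
    }
}
```

This is why the comment suggests letting the priority come from the table descriptor (or from the scheduler) rather than from the name alone.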
[jira] [Commented] (PHOENIX-3360) Secondary index configuration is wrong
[ https://issues.apache.org/jira/browse/PHOENIX-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553867#comment-15553867 ] Enis Soztutar commented on PHOENIX-3360: There are a few possible fixes: - Keep a form of ServerRpcController, but set it dynamically on the Configuration / Connection that is used by the IndexWriters. ClientRpcController already seems to be set for the Phoenix connections (I did not verify the priorities). - Dynamically use the per-operation priorities (HBASE-15816) for all index requests. - Have an RpcController or scheduler (in HBase) that uses the table's default priority levels to schedule. This combines nicely with PHOENIX-3072 + HBASE-16095. > Secondary index configuration is wrong > -- > > Key: PHOENIX-3360 > URL: https://issues.apache.org/jira/browse/PHOENIX-3360 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Priority: Critical > > IndexRpcScheduler allocates some handler threads and uses a higher priority > for RPCs. The corresponding IndexRpcController is not used by default as it > is, but used through ServerRpcControllerFactory that we configure from Ambari > by default which sets the priority of the outgoing RPCs to either metadata > priority, or the index priority. > However, after reading code of IndexRpcController / ServerRpcController it > seems that the IndexRPCController DOES NOT look at whether the outgoing RPC > is for an Index table or not. It just sets ALL rpc priorities to be the index > priority. The intention seems to be the case that ONLY on servers, we > configure ServerRpcControllerFactory, and with clients we NEVER configure > ServerRpcControllerFactory, but instead use ClientRpcControllerFactory. We > configure ServerRpcControllerFactory from Ambari, which in effect makes it so > that ALL rpcs from Phoenix are only handled by the index handlers by default. > It means all deadlock cases are still there. 
> The documentation in https://phoenix.apache.org/secondary_indexing.html is > also wrong in this sense. It does not talk about server side / client side. > Plus this way of configuring different values is not how HBase configuration > is deployed. We cannot have the configuration show the > ServerRpcControllerFactory even only for server nodes, because the clients > running on those nodes will also see the wrong values. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (PHOENIX-3360) Secondary index configuration is wrong
Enis Soztutar created PHOENIX-3360: -- Summary: Secondary index configuration is wrong Key: PHOENIX-3360 URL: https://issues.apache.org/jira/browse/PHOENIX-3360 Project: Phoenix Issue Type: Bug Reporter: Enis Soztutar Priority: Critical IndexRpcScheduler allocates some handler threads and uses a higher priority for RPCs. The corresponding IndexRpcController is not used by default as it is, but used through ServerRpcControllerFactory that we configure from Ambari by default which sets the priority of the outgoing RPCs to either metadata priority, or the index priority. However, after reading code of IndexRpcController / ServerRpcController it seems that the IndexRPCController DOES NOT look at whether the outgoing RPC is for an Index table or not. It just sets ALL rpc priorities to be the index priority. The intention seems to be the case that ONLY on servers, we configure ServerRpcControllerFactory, and with clients we NEVER configure ServerRpcControllerFactory, but instead use ClientRpcControllerFactory. We configure ServerRpcControllerFactory from Ambari, which in effect makes it so that ALL rpcs from Phoenix are only handled by the index handlers by default. It means all deadlock cases are still there. The documentation in https://phoenix.apache.org/secondary_indexing.html is also wrong in this sense. It does not talk about server side / client side. Plus this way of configuring different values is not how HBase configuration is deployed. We cannot have the configuration show the ServerRpcControllerFactory even only for server nodes, because the clients running on those nodes will also see the wrong values. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3072) Deadlock on region opening with secondary index recovery
[ https://issues.apache.org/jira/browse/PHOENIX-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491395#comment-15491395 ] Enis Soztutar commented on PHOENIX-3072: Thanks James, Josh. > Deadlock on region opening with secondary index recovery > > > Key: PHOENIX-3072 > URL: https://issues.apache.org/jira/browse/PHOENIX-3072 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.9.0, 4.8.1 > > Attachments: PHOENIX-3072_v3.patch, PHOENIX-3072_v4.patch, > phoenix-3072_v1.patch, phoenix-3072_v2.patch > > > There is a distributed deadlock happening in clusters with some moderate > number of regions for the data tables and secondary index tables and cluster > and it is cluster restart or some large failure. We have seen this in a > couple of production cases already. > Opening of regions in hbase is performed by a thread pool with 3 threads by > default. Every regionserver can open 3 regions at a time. However, opening > data table regions has to write to multiple index regions during WAL > recovery. All other region open requests are queued up in a single queue. > This causes a deadlock, since the secondary index regions are also opened by > the same thread pools that we do the work. So if there is greater number of > data table regions then available number of region opening threads from > regionservers, the secondary index region open requests just wait to be > processed in the queue. Since these index regions are not open, the region > opening of data table regions just block the region opening threads for a > long time. > One proposed fix is to use a different thread pool for opening regions of the > secondary index tables so that we will not deadlock. See HBASE-16095 for the > HBase-level fix. In Phoenix, we just have to set the priority for secondary > index tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3072) Deadlock on region opening with secondary index recovery
[ https://issues.apache.org/jira/browse/PHOENIX-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15488591#comment-15488591 ] Enis Soztutar commented on PHOENIX-3072: Created PHOENIX-3274 as a follow up. Agreed that we can commit this, and do the upgrade handling there. > Deadlock on region opening with secondary index recovery > > > Key: PHOENIX-3072 > URL: https://issues.apache.org/jira/browse/PHOENIX-3072 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.9.0, 4.8.1 > > Attachments: phoenix-3072_v1.patch, phoenix-3072_v2.patch > > > There is a distributed deadlock happening in clusters with some moderate > number of regions for the data tables and secondary index tables and cluster > and it is cluster restart or some large failure. We have seen this in a > couple of production cases already. > Opening of regions in hbase is performed by a thread pool with 3 threads by > default. Every regionserver can open 3 regions at a time. However, opening > data table regions has to write to multiple index regions during WAL > recovery. All other region open requests are queued up in a single queue. > This causes a deadlock, since the secondary index regions are also opened by > the same thread pools that we do the work. So if there is greater number of > data table regions then available number of region opening threads from > regionservers, the secondary index region open requests just wait to be > processed in the queue. Since these index regions are not open, the region > opening of data table regions just block the region opening threads for a > long time. > One proposed fix is to use a different thread pool for opening regions of the > secondary index tables so that we will not deadlock. See HBASE-16095 for the > HBase-level fix. In Phoenix, we just have to set the priority for secondary > index tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (PHOENIX-3274) Alter the PRIORITY of existing tables after PHOENIX-3072
Enis Soztutar created PHOENIX-3274: -- Summary: Alter the PRIORITY of existing tables after PHOENIX-3072 Key: PHOENIX-3274 URL: https://issues.apache.org/jira/browse/PHOENIX-3274 Project: Phoenix Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 4.9.0 This is a follow up work after PHOENIX-3072. From the discussion, it is better to alter the existing table after an upgrade and set the PRIORITY definitions as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3072) Deadlock on region opening with secondary index recovery
[ https://issues.apache.org/jira/browse/PHOENIX-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15488407#comment-15488407 ] Enis Soztutar commented on PHOENIX-3072: bq. It's difficult to tell what's changed with all the whitespace diffs. Can you generate a patch without that? Sure. We should however clean up the code base. It has a lot of whitespace and indentation issues already. bq. It looks like you're setting a new "PRIORITY" attribute on table descriptor for indexes? How/where is this used? (never mind on this - I see it's part of an HBase JIRA). Yep, it is introduced and used via HBASE-16095. bq. How will you handle local indexes since the table descriptor is the same data and index table Local indexes will not have this problem since they are in the same table. There is no inter-dependency between index regions and data table regions. bq. Minor nit: is I suppose you're not using the HBase static constant for "PRIORITY" because this doesn't appear until HBase 1.3? Maybe we should define one in QueryConstants with a comment? Done in v2. bq. Didn't priority get exposed as an attribute on operations now? If so, would that be an alternate implementation mechanism which is a bit more flexible? This is not related to RPCs at all. The deadlock happens at region opening. BTW the patch for per-operation priorities is not committed yet I think on the HBase side. bq. What about existing tables and indexes - I didn't see any upgrade code that sets this for those. If setting priority on operation is an option, that'd get around this. I've thought about this, but it seems dangerous to alter the existing tables when Phoenix is upgraded. That is why there is no upgrade handling. Altering an existing table might have implications on availability, etc. Do we do this kind of alter for other features on Phoenix upgrade? If so we can hook into that. 
> Deadlock on region opening with secondary index recovery > > > Key: PHOENIX-3072 > URL: https://issues.apache.org/jira/browse/PHOENIX-3072 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.9.0, 4.8.1 > > Attachments: phoenix-3072_v1.patch, phoenix-3072_v2.patch > > > There is a distributed deadlock happening in clusters with some moderate > number of regions for the data tables and secondary index tables and cluster > and it is cluster restart or some large failure. We have seen this in a > couple of production cases already. > Opening of regions in hbase is performed by a thread pool with 3 threads by > default. Every regionserver can open 3 regions at a time. However, opening > data table regions has to write to multiple index regions during WAL > recovery. All other region open requests are queued up in a single queue. > This causes a deadlock, since the secondary index regions are also opened by > the same thread pools that we do the work. So if there is greater number of > data table regions then available number of region opening threads from > regionservers, the secondary index region open requests just wait to be > processed in the queue. Since these index regions are not open, the region > opening of data table regions just block the region opening threads for a > long time. > One proposed fix is to use a different thread pool for opening regions of the > secondary index tables so that we will not deadlock. See HBASE-16095 for the > HBase-level fix. In Phoenix, we just have to set the priority for secondary > index tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3072) Deadlock on region opening with secondary index recovery
[ https://issues.apache.org/jira/browse/PHOENIX-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15475018#comment-15475018 ] Enis Soztutar commented on PHOENIX-3072: We should commit this for 4.8.1. We have seen this in multiple production clusters already. Although there is a known workaround, it is a hassle and very user-unfriendly. > Deadlock on region opening with secondary index recovery > > > Key: PHOENIX-3072 > URL: https://issues.apache.org/jira/browse/PHOENIX-3072 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.9.0, 4.8.1 > > Attachments: phoenix-3072_v1.patch > > > There is a distributed deadlock happening in clusters with some moderate > number of regions for the data tables and secondary index tables and cluster > and it is cluster restart or some large failure. We have seen this in a > couple of production cases already. > Opening of regions in hbase is performed by a thread pool with 3 threads by > default. Every regionserver can open 3 regions at a time. However, opening > data table regions has to write to multiple index regions during WAL > recovery. All other region open requests are queued up in a single queue. > This causes a deadlock, since the secondary index regions are also opened by > the same thread pools that we do the work. So if there is greater number of > data table regions then available number of region opening threads from > regionservers, the secondary index region open requests just wait to be > processed in the queue. Since these index regions are not open, the region > opening of data table regions just block the region opening threads for a > long time. > One proposed fix is to use a different thread pool for opening regions of the > secondary index tables so that we will not deadlock. See HBASE-16095 for the > HBase-level fix. In Phoenix, we just have to set the priority for secondary > index tables. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3072) Deadlock on region opening with secondary index recovery
[ https://issues.apache.org/jira/browse/PHOENIX-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15475015#comment-15475015 ] Enis Soztutar commented on PHOENIX-3072: bq. On the RS, we already make index table updates higher priority than data table updates This happens on region open and does not involve RPC scheduling. In a cluster restart, all of the index and data table regions will be opened by the regionservers. There are only 3 threads that do the opening of regions by default, and for the data tables, the opening of the region blocks on doing the index updates. However, if the index regions are not opened yet, the data table opens will not succeed even if the regionserver RPC scheduling works. The index regions will be waiting on the same "region opening queue" to be opened by the same regionserver. bq. Also, would you mind generating a patch that ignores whitespace changes as it's difficult to find the change you've made. Sorry, the existing code is full of extra whitespace, and my Eclipse settings truncate it as a save action. This is to make sure that my patches do not introduce any more extra whitespace. I can put the patch in RB/github if you want. > Deadlock on region opening with secondary index recovery > > > Key: PHOENIX-3072 > URL: https://issues.apache.org/jira/browse/PHOENIX-3072 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.9.0, 4.8.1 > > Attachments: phoenix-3072_v1.patch > > > There is a distributed deadlock happening in clusters with some moderate > number of regions for the data tables and secondary index tables and cluster > and it is cluster restart or some large failure. We have seen this in a > couple of production cases already. > Opening of regions in hbase is performed by a thread pool with 3 threads by > default. Every regionserver can open 3 regions at a time. 
However, opening > data table regions has to write to multiple index regions during WAL > recovery. All other region open requests are queued up in a single queue. > This causes a deadlock, since the secondary index regions are also opened by > the same thread pools that we do the work. So if there is greater number of > data table regions then available number of region opening threads from > regionservers, the secondary index region open requests just wait to be > processed in the queue. Since these index regions are not open, the region > opening of data table regions just block the region opening threads for a > long time. > One proposed fix is to use a different thread pool for opening regions of the > secondary index tables so that we will not deadlock. See HBASE-16095 for the > HBase-level fix. In Phoenix, we just have to set the priority for secondary > index tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
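The queueing deadlock described in this comment can be reproduced in miniature with a plain Java thread pool. This is an illustration, not HBase code: three "data region open" tasks occupy all three workers waiting for the "index region", whose own open task is stuck behind them in the same queue.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class RegionOpenDeadlock {
    // Returns true if every region managed to open; with one shared queue it cannot.
    static boolean simulate() throws InterruptedException {
        ExecutorService openPool = Executors.newFixedThreadPool(3); // like the RS default of 3 open threads
        CountDownLatch indexRegionOpen = new CountDownLatch(1);

        // Three data-table region opens are queued first; each blocks (as WAL
        // replay would) until the index region is open.
        for (int i = 0; i < 3; i++) {
            openPool.submit(() -> {
                indexRegionOpen.await();
                return null;
            });
        }
        // The index region open waits behind them in the same queue and never runs.
        openPool.submit(indexRegionOpen::countDown);

        openPool.shutdown();
        boolean allOpened = openPool.awaitTermination(2, TimeUnit.SECONDS);
        openPool.shutdownNow(); // interrupt the stuck tasks so the JVM can exit
        return allOpened;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("all regions opened: " + simulate());
    }
}
```

With a separate pool for index-table region opens, the fourth task no longer waits behind the first three, which is the shape of the HBASE-16095 fix.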
[jira] [Commented] (PHOENIX-3111) Possible Deadlock/delay while building index, upsert select, delete rows at server
[ https://issues.apache.org/jira/browse/PHOENIX-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406847#comment-15406847 ] Enis Soztutar commented on PHOENIX-3111: Indeed. We have a test suite which is supposed to run for 48 hours, that does concurrent index create, bulk load, upserts etc. [~sergey.soldatov] and [~speleato] might be able to give more details. After we get the 4.8 out, we can focus on how to share those kinds of tests via {{hbase-it}} like module. > Possible Deadlock/delay while building index, upsert select, delete rows at > server > -- > > Key: PHOENIX-3111 > URL: https://issues.apache.org/jira/browse/PHOENIX-3111 > Project: Phoenix > Issue Type: Bug >Reporter: Sergio Peleato >Assignee: Rajeshbabu Chintaguntla >Priority: Critical > Fix For: 4.8.0 > > Attachments: PHOENIX-3111.patch, PHOENIX-3111_addendum.patch, > PHOENIX-3111_v2.patch > > > There is a possible deadlock while building local index or running upsert > select, delete at server. The situation might happen in this case. > In the above queries we scan mutations from table and write back to same > table in that case there is a chance of memstore might reach the threshold of > blocking memstore size then RegionTooBusyException might be thrown back to > client and queries might retry scanning. > Let's suppose if we take a local index build index case we first scan from > the data table and prepare index mutations and write back to same table. > So there is chance of memstore full as well in that case we try to flush the > region. But if the split happen in between then split might be waiting for > write lock on the region to close and flush wait for readlock because the > write lock in the queue until the local index build completed. Local index > build won't complete because we are not allowed to write until there is > flush. This might not be complete deadlock situation but the queries might > take lot of time to complete in this cases. 
> {noformat} > "regionserver//192.168.0.53:16201-splits-1469165876186" #269 prio=5 > os_prio=31 tid=0x7f7fb2050800 nid=0x1c033 waiting on condition > [0x000139b68000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x0006ede72550> (a > java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) > at > java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) > at > org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1422) > at > org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1370) > - locked <0x0006ede69d00> (a java.lang.Object) > at > org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:394) > at > org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:278) > at > org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:561) > at > org.apache.hadoop.hbase.regionserver.SplitRequest.doSplitting(SplitRequest.java:82) > at > org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:154) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) >Locked ownable synchronizers: > - <0x0006ee132098> (a > java.util.concurrent.ThreadPoolExecutor$Worker) > {noformat} > {noformat} > "MemStoreFlusher.0" #170 prio=5 os_prio=31 tid=0x7f7fb6842000 
nid=0x19303 > waiting on condition [0x0001388e9000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x0006ede72550> (a > java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at >
[jira] [Commented] (PHOENIX-3126) The driver implementation should take into account the context of the user
[ https://issues.apache.org/jira/browse/PHOENIX-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402802#comment-15402802 ] Enis Soztutar commented on PHOENIX-3126: There is {{UserProvider.getCurrent()}} for testing purposes BTW. > The driver implementation should take into account the context of the user > -- > > Key: PHOENIX-3126 > URL: https://issues.apache.org/jira/browse/PHOENIX-3126 > Project: Phoenix > Issue Type: Bug >Reporter: Devaraj Das > Attachments: PHOENIX-3126.txt, .java > > > Ran into this issue ... > We have an application that proxies various users internally and fires > queries for those users. The Phoenix driver implementation caches connections > it successfully creates and keys it by the ConnectionInfo. The ConnectionInfo > doesn't take into consideration the "user". So random users (including those > that aren't supposed to access) can access the tables in this sort of a setup. > The fix is to also consider the User in the ConnectionInfo. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3126) The driver implementation should take into account the context of the user
[ https://issues.apache.org/jira/browse/PHOENIX-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402796#comment-15402796 ] Enis Soztutar commented on PHOENIX-3126: +1 for 4.8. This looks like a serious enough issue. Both HConnectionKey (at the HConnection level) and ConnectionId at the tcp level connection to RS's contain the UGI information for HBase. Doing this for Phoenix makes sense. > The driver implementation should take into account the context of the user > -- > > Key: PHOENIX-3126 > URL: https://issues.apache.org/jira/browse/PHOENIX-3126 > Project: Phoenix > Issue Type: Bug >Reporter: Devaraj Das > Attachments: PHOENIX-3126.txt, .java > > > Ran into this issue ... > We have an application that proxies various users internally and fires > queries for those users. The Phoenix driver implementation caches connections > it successfully creates and keys it by the ConnectionInfo. The ConnectionInfo > doesn't take into consideration the "user". So random users (including those > that aren't supposed to access) can access the tables in this sort of a setup. > The fix is to also consider the User in the ConnectionInfo. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
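The fix discussed above — including the user in the connection-cache key — can be sketched with a toy cache (hypothetical names; Phoenix's real key is its ConnectionInfo class, which the patch extends with the User):

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the fix: key the client-side connection cache by
// (url, user) instead of url alone, so proxied users get distinct connections.
class ConnectionCache {
    // Simplified stand-in for Phoenix's ConnectionInfo.
    static final class ConnectionKey {
        final String url;
        final String user;
        ConnectionKey(String url, String user) { this.url = url; this.user = user; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof ConnectionKey)) return false;
            ConnectionKey k = (ConnectionKey) o;
            return url.equals(k.url) && user.equals(k.user);
        }
        @Override public int hashCode() { return Objects.hash(url, user); }
    }

    private final Map<ConnectionKey, Object> cache = new ConcurrentHashMap<>();

    Object getConnection(String url, String user) {
        // Without `user` in the key, the first caller's connection (and its
        // authorization context) would be handed to every subsequent user.
        return cache.computeIfAbsent(new ConnectionKey(url, user), k -> new Object());
    }
}
```

This mirrors what HConnectionKey and ConnectionId already do at the HBase layers: the UGI is part of the identity of the cached connection.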
[jira] [Commented] (PHOENIX-3111) Possible Deadlock/delay while building index, upsert select, delete rows at server
[ https://issues.apache.org/jira/browse/PHOENIX-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400377#comment-15400377 ] Enis Soztutar commented on PHOENIX-3111: Indeed. +1 from me for 4.8. However, we should make sure that the thread interrupt status is restored if we are catching these interruptions: {code} + } catch (InterruptedException e) { {code} > Possible Deadlock/delay while building index, upsert select, delete rows at > server > -- > > Key: PHOENIX-3111 > URL: https://issues.apache.org/jira/browse/PHOENIX-3111 > Project: Phoenix > Issue Type: Bug >Reporter: Sergio Peleato >Assignee: Rajeshbabu Chintaguntla >Priority: Critical > Fix For: 4.8.0 > > Attachments: PHOENIX-3111.patch > > > There is a possible deadlock while building local index or running upsert > select, delete at server. The situation might happen in this case. > In the above queries we scan mutations from table and write back to same > table in that case there is a chance of memstore might reach the threshold of > blocking memstore size then RegionTooBusyException might be thrown back to > client and queries might retry scanning. > Let's suppose if we take a local index build index case we first scan from > the data table and prepare index mutations and write back to same table. > So there is chance of memstore full as well in that case we try to flush the > region. But if the split happen in between then split might be waiting for > write lock on the region to close and flush wait for readlock because the > write lock in the queue until the local index build completed. Local index > build won't complete because we are not allowed to write until there is > flush. This might not be complete deadlock situation but the queries might > take lot of time to complete in this cases. 
> {noformat} > "regionserver//192.168.0.53:16201-splits-1469165876186" #269 prio=5 > os_prio=31 tid=0x7f7fb2050800 nid=0x1c033 waiting on condition > [0x000139b68000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x0006ede72550> (a > java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) > at > java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) > at > org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1422) > at > org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1370) > - locked <0x0006ede69d00> (a java.lang.Object) > at > org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:394) > at > org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:278) > at > org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:561) > at > org.apache.hadoop.hbase.regionserver.SplitRequest.doSplitting(SplitRequest.java:82) > at > org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:154) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) >Locked ownable synchronizers: > - <0x0006ee132098> (a > java.util.concurrent.ThreadPoolExecutor$Worker) > {noformat} > {noformat} > "MemStoreFlusher.0" #170 prio=5 os_prio=31 tid=0x7f7fb6842000 
nid=0x19303 > waiting on condition [0x0001388e9000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x0006ede72550> (a > java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967) > at >
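The review note above asks that the thread's interrupt status be restored when an InterruptedException is caught rather than rethrown. A minimal sketch of the idiom (illustrative method name, not Phoenix code):

```java
// Idiomatic handling of InterruptedException: restore the thread's interrupt
// status so callers further up the stack can still observe the interruption.
class InterruptRestore {
    static boolean waitQuietly(long millis) {
        try {
            Thread.sleep(millis);
            return false; // completed without interruption
        } catch (InterruptedException e) {
            // Swallowing the exception without this line would silently clear
            // the interrupt flag; re-set it instead of rethrowing.
            Thread.currentThread().interrupt();
            return true;
        }
    }
}
```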
[jira] [Commented] (PHOENIX-3111) Possible Deadlock/delay while building index, upsert select, delete rows at server
[ https://issues.apache.org/jira/browse/PHOENIX-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399906#comment-15399906 ] Enis Soztutar commented on PHOENIX-3111: I think a similar thing like the one here is happening: https://issues.apache.org/jira/browse/PHOENIX-2667?focusedCommentId=15138214=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15138214. Because of the RW lock heuristics to avoid starvation, once a write lock request is queued up, no more read locks can be acquired. So if an operation (like scan -> batchMutate) has the read lock, and needs to do flush() which requires another read lock, that may get deadlocked if a write-lock request comes in between. > Possible Deadlock/delay while building index, upsert select, delete rows at > server > -- > > Key: PHOENIX-3111 > URL: https://issues.apache.org/jira/browse/PHOENIX-3111 > Project: Phoenix > Issue Type: Bug >Reporter: Sergio Peleato >Assignee: Rajeshbabu Chintaguntla >Priority: Critical > Fix For: 4.8.0 > > Attachments: PHOENIX-3111.patch > > > There is a possible deadlock while building local index or running upsert > select, delete at server. The situation might happen in this case. > In the above queries we scan mutations from table and write back to same > table in that case there is a chance of memstore might reach the threshold of > blocking memstore size then RegionTooBusyException might be thrown back to > client and queries might retry scanning. > Let's suppose if we take a local index build index case we first scan from > the data table and prepare index mutations and write back to same table. > So there is chance of memstore full as well in that case we try to flush the > region. But if the split happen in between then split might be waiting for > write lock on the region to close and flush wait for readlock because the > write lock in the queue until the local index build completed. 
Local index > build won't complete because we are not allowed to write until there is > flush. This might not be complete deadlock situation but the queries might > take lot of time to complete in this cases. > {noformat} > "regionserver//192.168.0.53:16201-splits-1469165876186" #269 prio=5 > os_prio=31 tid=0x7f7fb2050800 nid=0x1c033 waiting on condition > [0x000139b68000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x0006ede72550> (a > java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) > at > java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) > at > org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1422) > at > org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1370) > - locked <0x0006ede69d00> (a java.lang.Object) > at > org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:394) > at > org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:278) > at > org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:561) > at > org.apache.hadoop.hbase.regionserver.SplitRequest.doSplitting(SplitRequest.java:82) > at > org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:154) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at 
java.lang.Thread.run(Thread.java:745) >Locked ownable synchronizers: > - <0x0006ee132098> (a > java.util.concurrent.ThreadPoolExecutor$Worker) > {noformat} > {noformat} > "MemStoreFlusher.0" #170 prio=5 os_prio=31 tid=0x7f7fb6842000 nid=0x19303 > waiting on condition [0x0001388e9000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x0006ede72550> (a > java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at >
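The lock heuristic described in the comment above can be shown with a self-contained toy (not HBase code): once a writer is queued on a non-fair ReentrantReadWriteLock, new read-lock requests from other threads park behind it, even though only readers currently hold the lock. (Observed behavior of the OpenJDK implementation's anti-starvation heuristic.)

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Toy model of the scan -> batchMutate / flush / split interaction:
// a held read lock, a queued writer, and a second reader that gets stuck.
class RwLockQueueDemo {
    static boolean secondReaderBlocks() throws InterruptedException {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock(); // non-fair, as in HRegion
        lock.readLock().lock();            // "scan -> batchMutate" holds a read lock

        Thread writer = new Thread(() -> lock.writeLock().lock()); // "split" wants close()
        writer.setDaemon(true);
        writer.start();
        Thread.sleep(200);                 // let the writer park in the wait queue

        // "flush" now needs its own read lock from a different thread; with a
        // writer queued, the anti-starvation heuristic makes it wait, too.
        final boolean[] acquired = new boolean[1];
        Thread flusher = new Thread(() -> {
            try {
                acquired[0] = lock.readLock().tryLock(500, TimeUnit.MILLISECONDS);
                if (acquired[0]) lock.readLock().unlock();
            } catch (InterruptedException ignored) { }
        });
        flusher.start();
        flusher.join();
        return !acquired[0];               // true: the second reader was blocked
    }
}
```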
[jira] [Commented] (PHOENIX-3098) Possible NegativeArraySizeException while scanning local indexes during regions merge
[ https://issues.apache.org/jira/browse/PHOENIX-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15386954#comment-15386954 ] Enis Soztutar commented on PHOENIX-3098: +1. This was reproducing in internal tests very easily. We want this in the new 4.8 RCs I think. [~an...@apache.org]. > Possible NegativeArraySizeException while scanning local indexes during > regions merge > -- > > Key: PHOENIX-3098 > URL: https://issues.apache.org/jira/browse/PHOENIX-3098 > Project: Phoenix > Issue Type: Bug >Reporter: Sergio Peleato >Assignee: Rajeshbabu Chintaguntla > Fix For: 4.8.0 > > Attachments: PHOENIX-3098.patch > > > While scanning local indexes during regions merge we might end up with a > NegativeArraySizeException, which brings the RS down. The reason for this is that > sometimes HBase won't do a real seek and considers fake keyvalues (can be the scan > start row) as seeked kvs. In that case we end up with this issue when we > call peek without seek. So for local indexes we need to enforce seek all the > time when scanning local index reference files. 
> {noformat} > 2016-07-15 17:27:04,419 ERROR > [B.fifo.QRpcServer.handler=8,queue=2,port=16020] coprocessor.CoprocessorHost: > The coprocessor > org.apache.hadoop.hbase.regionserver.IndexHalfStoreFileReaderGenerator threw > java.lang.NegativeArraySizeException > java.lang.NegativeArraySizeException > at > org.apache.hadoop.hbase.regionserver.LocalIndexStoreFileScanner.getNewRowkeyByRegionStartKeyReplacedWithSplitKey(LocalIndexStoreFileScanner.java:242) > at > org.apache.hadoop.hbase.regionserver.LocalIndexStoreFileScanner.getChangedKey(LocalIndexStoreFileScanner.java:76) > at > org.apache.hadoop.hbase.regionserver.LocalIndexStoreFileScanner.peek(LocalIndexStoreFileScanner.java:68) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.(KeyValueHeap.java:87) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.(KeyValueHeap.java:71) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.resetKVHeap(StoreScanner.java:378) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:227) > at > org.apache.hadoop.hbase.regionserver.IndexHalfStoreFileReaderGenerator$1.(IndexHalfStoreFileReaderGenerator.java:259) > at > org.apache.hadoop.hbase.regionserver.IndexHalfStoreFileReaderGenerator.preStoreScannerOpen(IndexHalfStoreFileReaderGenerator.java:258) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$51.call(RegionCoprocessorHost.java:1284) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionOperation.call(RegionCoprocessorHost.java:1638) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1712) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperationWithResult(RegionCoprocessorHost.java:1677) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.preStoreScannerOpen(RegionCoprocessorHost.java:1279) > at > org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:2110) > at > 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.(HRegion.java:5568) > at > org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:2626) > at > org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2612) > at > org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2594) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2271) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32205) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2127) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107) > at > org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133) > at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108) > at java.lang.Thread.run(Thread.java:745) > {noformat} > Thanks [~speleato] for finding this issue. Added you as reporter for this > issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (PHOENIX-3075) Phoenix-hive module is writing under ./build instead of ./target
Enis Soztutar created PHOENIX-3075: -- Summary: Phoenix-hive module is writing under ./build instead of ./target Key: PHOENIX-3075 URL: https://issues.apache.org/jira/browse/PHOENIX-3075 Project: Phoenix Issue Type: Bug Reporter: Enis Soztutar Fix For: 4.9.0 Running tests on phoenix-hive, we are writing under ./build and ./target, instead of specific ./target/test-data/: {code} ./build/test/data/dfs/data/data1 ./build/test/data/dfs/data/data1/current ./build/test/data/dfs/data/data1/current/BP-1628037287-10.22.8.221-1468541463865 ./build/test/data/dfs/data/data1/current/BP-1628037287-10.22.8.221-1468541463865/current ./build/test/data/dfs/data/data1/current/BP-1628037287-10.22.8.221-1468541463865/current/dfsUsed ./target/MiniMRCluster_1052289061 ./target/MiniMRCluster_1052289061/MiniMRCluster_1052289061-localDir-nm-0_0 ./target/MiniMRCluster_1052289061/MiniMRCluster_1052289061-localDir-nm-0_1 ./target/MiniMRCluster_1052289061/MiniMRCluster_1052289061-localDir-nm-0_2 ./target/MiniMRCluster_1052289061/MiniMRCluster_1052289061-localDir-nm-0_3 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
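A likely fix (a sketch only, assuming the standard Hadoop test properties; the Phoenix build may wire this differently) is to point the mini-cluster data directories under target/ via system properties on the test plugin — MiniDFSCluster consults {{test.build.data}} when choosing its data directory:

```xml
<!-- Sketch: route mini-cluster test data under target/ instead of ./build.
     Property names assume standard Hadoop test conventions
     (MiniDFSCluster reads test.build.data). -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-failsafe-plugin</artifactId>
  <configuration>
    <systemPropertyVariables>
      <test.build.data>${project.build.directory}/test-data</test.build.data>
      <hadoop.tmp.dir>${project.build.directory}/hadoop-tmp</hadoop.tmp.dir>
    </systemPropertyVariables>
  </configuration>
</plugin>
```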
[jira] [Updated] (PHOENIX-3072) Deadlock on region opening with secondary index recovery
[ https://issues.apache.org/jira/browse/PHOENIX-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated PHOENIX-3072: --- Attachment: phoenix-3072_v1.patch Here is a v1 patch: - We assign metadata priority to all SYSTEM tables. - We assign index priority to all secondary index tables The priorities will work if HBASE-16095 is there, otherwise hbase will just ignore the table property. > Deadlock on region opening with secondary index recovery > > > Key: PHOENIX-3072 > URL: https://issues.apache.org/jira/browse/PHOENIX-3072 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.9.0, 4.8.1 > > Attachments: phoenix-3072_v1.patch > > > There is a distributed deadlock happening in clusters with some moderate > number of regions for the data tables and secondary index tables and cluster > and it is cluster restart or some large failure. We have seen this in a > couple of production cases already. > Opening of regions in hbase is performed by a thread pool with 3 threads by > default. Every regionserver can open 3 regions at a time. However, opening > data table regions has to write to multiple index regions during WAL > recovery. All other region open requests are queued up in a single queue. > This causes a deadlock, since the secondary index regions are also opened by > the same thread pools that we do the work. So if there is greater number of > data table regions then available number of region opening threads from > regionservers, the secondary index region open requests just wait to be > processed in the queue. Since these index regions are not open, the region > opening of data table regions just block the region opening threads for a > long time. > One proposed fix is to use a different thread pool for opening regions of the > secondary index tables so that we will not deadlock. See HBASE-16095 for the > HBase-level fix. 
In Phoenix, we just have to set the priority for secondary > index tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (PHOENIX-3072) Deadlock on region opening with secondary index recovery
Enis Soztutar created PHOENIX-3072: -- Summary: Deadlock on region opening with secondary index recovery Key: PHOENIX-3072 URL: https://issues.apache.org/jira/browse/PHOENIX-3072 Project: Phoenix Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 4.9.0 There is a distributed deadlock happening in clusters with a moderate number of regions for the data tables and secondary index tables when there is a cluster restart or some large failure. We have seen this in a couple of production cases already. Opening of regions in hbase is performed by a thread pool with 3 threads by default. Every regionserver can open 3 regions at a time. However, opening data table regions has to write to multiple index regions during WAL recovery. All other region open requests are queued up in a single queue. This causes a deadlock, since the secondary index regions are also opened by the same thread pools that do the work. So if there is a greater number of data table regions than available region-opening threads on the regionservers, the secondary index region open requests just wait to be processed in the queue. Since these index regions are not open, the opening of data table regions just blocks the region-opening threads for a long time. One proposed fix is to use a different thread pool for opening regions of the secondary index tables so that we will not deadlock. See HBASE-16095 for the HBase-level fix. In Phoenix, we just have to set the priority for secondary index tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
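The self-deadlock described above can be reproduced in miniature (a toy model, not HBase code): a fixed pool of "region opener" threads fills up with data-table opens that each wait on an index-region open, which sits queued behind them on the same pool and never runs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy model of the region-open deadlock: more blocked data-table opens than
// opener threads, with the index-region open stuck behind them in the queue.
class RegionOpenDeadlockDemo {
    static boolean deadlocks() throws InterruptedException {
        ExecutorService openers = Executors.newFixedThreadPool(3); // default opener pool size
        CountDownLatch indexRegionOpen = new CountDownLatch(1);
        for (int i = 0; i < 3; i++) {          // more data regions than opener threads
            openers.submit(() -> {
                try {
                    indexRegionOpen.await();   // WAL replay must write to the index region
                } catch (InterruptedException ignored) { }
            });
        }
        openers.submit(indexRegionOpen::countDown); // index-region open, stuck in the queue
        boolean opened = indexRegionOpen.await(500, TimeUnit.MILLISECONDS);
        openers.shutdownNow();                 // interrupt the stuck workers
        return !opened;                        // true: the system is wedged
    }
}
```

Running the index-region opens on a separate pool (or, as the attached patch does via table priorities plus HBASE-16095, on a higher-priority pool) breaks the cycle.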
[jira] [Updated] (PHOENIX-3072) Deadlock on region opening with secondary index recovery
[ https://issues.apache.org/jira/browse/PHOENIX-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated PHOENIX-3072: --- Fix Version/s: 4.8.1 > Deadlock on region opening with secondary index recovery > > > Key: PHOENIX-3072 > URL: https://issues.apache.org/jira/browse/PHOENIX-3072 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.9.0, 4.8.1 > > > There is a distributed deadlock happening in clusters with some moderate > number of regions for the data tables and secondary index tables and cluster > and it is cluster restart or some large failure. We have seen this in a > couple of production cases already. > Opening of regions in hbase is performed by a thread pool with 3 threads by > default. Every regionserver can open 3 regions at a time. However, opening > data table regions has to write to multiple index regions during WAL > recovery. All other region open requests are queued up in a single queue. > This causes a deadlock, since the secondary index regions are also opened by > the same thread pools that we do the work. So if there is greater number of > data table regions then available number of region opening threads from > regionservers, the secondary index region open requests just wait to be > processed in the queue. Since these index regions are not open, the region > opening of data table regions just block the region opening threads for a > long time. > One proposed fix is to use a different thread pool for opening regions of the > secondary index tables so that we will not deadlock. See HBASE-16095 for the > HBase-level fix. In Phoenix, we just have to set the priority for secondary > index tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3062) JMXCacheBuster restarting the metrics system causes PhoenixTracingEndToEndIT to hang
[ https://issues.apache.org/jira/browse/PHOENIX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376143#comment-15376143 ] Enis Soztutar commented on PHOENIX-3062: Thanks Josh. I've committed the hbase fix, but I'm holding for the RC to go out. > JMXCacheBuster restarting the metrics system causes PhoenixTracingEndToEndIT > to hang > > > Key: PHOENIX-3062 > URL: https://issues.apache.org/jira/browse/PHOENIX-3062 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.9.0 > > Attachments: phoenix-3062_v1.patch > > > With some recent fixes in the hbase metrics system, we are now effectively > restarting the metrics system (in HBase-1.3.0, probably not affecting 1.2.0). > Since we use a custom sink in the PhoenixTracingEndToEndIT, restarting the > metrics system loses the registered sink thus causing a hang. > We need a fix in HBase, and Phoenix so that we will not restart the metrics > during tests. > Thanks to [~sergey.soldatov] for analyzing the initial root cause of the > hang. > See HBASE-14166 and others. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-3067) Phoenix metrics system should not be started in pseudo-cluster mode
[ https://issues.apache.org/jira/browse/PHOENIX-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated PHOENIX-3067: --- Attachment: phoenix-3067_v1.patch Simple patch. We are manually testing this with a pseudo-cluster. > Phoenix metrics system should not be started in pseudo-cluster mode > --- > > Key: PHOENIX-3067 > URL: https://issues.apache.org/jira/browse/PHOENIX-3067 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.9.0 > > Attachments: phoenix-3067_v1.patch > > > Phoenix tracing piggy-backs on the metrics system by specifying a > SpanReceiver which is also a MetricsSource (TraceMetricsSource). > This works differently on client side versus server side. The hadoop metrics > system should only be initialized once, and can only have a single prefix > (hbase or phoenix, etc). We configure the metric sink through > hadoop-metrics2.properties differently in client side versus server side [1]. > Hadoop metric system is designed so that if it is initialized already with > some prefix (like hbase), re-initializing it again will be ignored unless it > is in "mini-cluster mode". We do not check whether the metrics is already > initialized from {{Metrics.java}} and blindly call > {{DefaultMetricsSystem.instance().init("phoenix")}} which works as long as we > are in distributed mode. Otherwise, the metrics sinks do not work. > [1] https://phoenix.apache.org/tracing.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-3067) Phoenix metrics system should not be started in mini-cluster mode
[ https://issues.apache.org/jira/browse/PHOENIX-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated PHOENIX-3067: --- Summary: Phoenix metrics system should not be started in mini-cluster mode (was: Phoenix metrics system should not be started in pseudo-cluster mode) > Phoenix metrics system should not be started in mini-cluster mode > - > > Key: PHOENIX-3067 > URL: https://issues.apache.org/jira/browse/PHOENIX-3067 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.9.0 > > Attachments: phoenix-3067_v1.patch > > > Phoenix tracing piggy-backs on the metrics system by specifying a > SpanReceiver which is also a MetricsSource (TraceMetricsSource). > This works differently on client side versus server side. The hadoop metrics > system should only be initialized once, and can only have a single prefix > (hbase or phoenix, etc). We configure the metric sink through > hadoop-metrics2.properties differently in client side versus server side [1]. > Hadoop metric system is designed so that if it is initialized already with > some prefix (like hbase), re-initializing it again will be ignored unless it > is in "mini-cluster mode". We do not check whether the metrics is already > initialized from {{Metrics.java}} and blindly call > {{DefaultMetricsSystem.instance().init("phoenix")}} which works as long as we > are in distributed mode. Otherwise, the metrics sinks do not work. > [1] https://phoenix.apache.org/tracing.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3067) Phoenix metrics system should not be started in pseudo-cluster mode
[ https://issues.apache.org/jira/browse/PHOENIX-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373474#comment-15373474 ] Enis Soztutar commented on PHOENIX-3067: Since there is no API to check whether metrics system is initialized or not, or whether we are in the server side or client side, a simple check to see whether we are in mini-cluster mode should do the trick. MetricsSystemImpl.init(): {code} public synchronized MetricsSystem init(String prefix) { if (monitoring && !DefaultMetricsSystem.inMiniClusterMode()) { LOG.warn(this.prefix +" metrics system already initialized!"); return this; } this.prefix = checkNotNull(prefix, "prefix"); ++refCount; if (monitoring) { // in mini cluster mode LOG.info(this.prefix +" metrics system started (again)"); return this; } {code} > Phoenix metrics system should not be started in pseudo-cluster mode > --- > > Key: PHOENIX-3067 > URL: https://issues.apache.org/jira/browse/PHOENIX-3067 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.9.0 > > > Phoenix tracing piggy-backs on the metrics system by specifying a > SpanReceiver which is also a MetricsSource (TraceMetricsSource). > This works differently on client side versus server side. The hadoop metrics > system should only be initialized once, and can only have a single prefix > (hbase or phoenix, etc). We configure the metric sink through > hadoop-metrics2.properties differently in client side versus server side [1]. > Hadoop metric system is designed so that if it is initialized already with > some prefix (like hbase), re-initializing it again will be ignored unless it > is in "mini-cluster mode". We do not check whether the metrics is already > initialized from {{Metrics.java}} and blindly call > {{DefaultMetricsSystem.instance().init("phoenix")}} which works as long as we > are in distributed mode. Otherwise, the metrics sinks do not work. 
> [1] https://phoenix.apache.org/tracing.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (PHOENIX-3067) Phoenix metrics system should not be started in pseudo-cluster mode
Enis Soztutar created PHOENIX-3067: -- Summary: Phoenix metrics system should not be started in pseudo-cluster mode Key: PHOENIX-3067 URL: https://issues.apache.org/jira/browse/PHOENIX-3067 Project: Phoenix Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 4.9.0 Phoenix tracing piggy-backs on the metrics system by specifying a SpanReceiver which is also a MetricsSource (TraceMetricsSource). This works differently on client side versus server side. The hadoop metrics system should only be initialized once, and can only have a single prefix (hbase or phoenix, etc). We configure the metric sink through hadoop-metrics2.properties differently in client side versus server side [1]. Hadoop metric system is designed so that if it is initialized already with some prefix (like hbase), re-initializing it again will be ignored unless it is in "mini-cluster mode". We do not check whether the metrics is already initialized from {{Metrics.java}} and blindly call {{DefaultMetricsSystem.instance().init("phoenix")}} which works as long as we are in distributed mode. Otherwise, the metrics sinks do not work. [1] https://phoenix.apache.org/tracing.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3045) Data regions in transition forever if RS holding them down during drop index
[ https://issues.apache.org/jira/browse/PHOENIX-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373428#comment-15373428 ] Enis Soztutar commented on PHOENIX-3045: Patch looks good. Do we need to handle disabled index tables as well, or is throwing it back the correct semantics? {code} } catch (MultiIndexWriteFailureException e) { +for (HTableInterfaceReference table : e.getFailedTables()) { +if (!admin.tableExists(table.getTableName())) { +LOG.warn("Failure due to non existing table: " + table.getTableName()); +nonExistingTablesList.add(table); +} else { +throw e; +} +} +} {code} > Data regions in transition forever if RS holding them down during drop index > > > Key: PHOENIX-3045 > URL: https://issues.apache.org/jira/browse/PHOENIX-3045 > Project: Phoenix > Issue Type: Bug >Reporter: Sergio Peleato >Assignee: Ankit Singhal > Fix For: 4.8.0 > > Attachments: PHOENIX-3045.patch, PHOENIX-3045_v1.patch, > PHOENIX-3045_v2.patch > > > There is a chance that the region server holding the data regions might be abruptly > killed before flushing the data table; this leads to the same failure case where data > regions won't be opened, which leaves the regions in transition forever. We > need to handle this case by checking for dropped indexes on recovery write > failures and skipping the corresponding mutations to them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2565) Store data for immutable tables in single KeyValue
[ https://issues.apache.org/jira/browse/PHOENIX-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15371983#comment-15371983 ] Enis Soztutar commented on PHOENIX-2565: This is a great addition. Is there a short design doc that explains the feature? I think we should not couple immutable tables with "compact storage". There can be cases where you would want to keep columns non-compact in immutable tables, and compact for mutable tables. The features should be orthogonal. If you are mutating an mutable column value, it can be read-modify-write similar to secondary index updates. Is compact storage all or nothing for a table? What about column families? Should we look into having a subset of columns to be stored together, something like the original locality-groups versus column family from bigtable (HBase column family is both locality group and column family in the big table terminology). > Store data for immutable tables in single KeyValue > -- > > Key: PHOENIX-2565 > URL: https://issues.apache.org/jira/browse/PHOENIX-2565 > Project: Phoenix > Issue Type: Improvement >Reporter: James Taylor >Assignee: Thomas D'Silva > Fix For: 4.9.0 > > Attachments: PHOENIX-2565-wip.patch > > > Since an immutable table (i.e. declared with IMMUTABLE_ROWS=true) will never > update a column value, it'd be more efficient to store all column values for > a row in a single KeyValue. We could use the existing format we have for > variable length arrays. > For backward compatibility, we'd need to support the current mechanism. Also, > you'd no longer be allowed to transition an existing table to/from being > immutable. I think the best approach would be to introduce a new IMMUTABLE > keyword and use it like this: > {code} > CREATE IMMUTABLE TABLE ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
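The "single KeyValue per row" idea for immutable tables can be illustrated with a simple length-prefixed packing (illustrative only; the proposal above reuses Phoenix's existing variable-length array format, not this layout):

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Illustrative packing of all column values of an immutable row into a single
// byte[] cell value: a count, then (length, bytes) per column.
class PackedRow {
    static byte[] pack(List<byte[]> columns) {
        int size = 4;
        for (byte[] c : columns) size += 4 + c.length;
        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.putInt(columns.size());
        for (byte[] c : columns) {
            buf.putInt(c.length);
            buf.put(c);
        }
        return buf.array();
    }

    static List<byte[]> unpack(byte[] packed) {
        ByteBuffer buf = ByteBuffer.wrap(packed);
        int n = buf.getInt();
        List<byte[]> out = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            byte[] c = new byte[buf.getInt()];
            buf.get(c);
            out.add(c);
        }
        return out;
    }
}
```

The win is one KeyValue (one rowkey copy, one timestamp, one cell overhead) per row instead of one per column — which is safe precisely because immutable rows never update a single column in place.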
[jira] [Updated] (PHOENIX-3062) JMXCacheBuster restarting the metrics system causes PhoenixTracingEndToEndIT to hang
[ https://issues.apache.org/jira/browse/PHOENIX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated PHOENIX-3062: --- Attachment: phoenix-3062_v1.patch Simple patch. Needs the HBASE fix to work. > JMXCacheBuster restarting the metrics system causes PhoenixTracingEndToEndIT > to hang > > > Key: PHOENIX-3062 > URL: https://issues.apache.org/jira/browse/PHOENIX-3062 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.9.0 > > Attachments: phoenix-3062_v1.patch > > > With some recent fixes in the hbase metrics system, we are now effectively > restarting the metrics system (in HBase-1.3.0, probably not affecting 1.2.0). > Since we use a custom sink in the PhoenixTracingEndToEndIT, restarting the > metrics system loses the registered sink, thus causing a hang. > We need a fix in HBase and Phoenix so that we will not restart the metrics > during tests. > Thanks to [~sergey.soldatov] for analyzing the initial root cause of the > hang. > See HBASE-14166 and others. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (PHOENIX-3062) JMXCacheBuster restarting the metrics system causes PhoenixTracingEndToEndIT to hang
Enis Soztutar created PHOENIX-3062: -- Summary: JMXCacheBuster restarting the metrics system causes PhoenixTracingEndToEndIT to hang Key: PHOENIX-3062 URL: https://issues.apache.org/jira/browse/PHOENIX-3062 Project: Phoenix Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 4.9.0 With some recent fixes in the hbase metrics system, we are now effectively restarting the metrics system (in HBase-1.3.0, probably not affecting 1.2.0). Since we use a custom sink in the PhoenixTracingEndToEndIT, restarting the metrics system loses the registered sink, thus causing a hang. We need a fix in HBase and Phoenix so that we will not restart the metrics during tests. Thanks to [~sergey.soldatov] for analyzing the initial root cause of the hang. See HBASE-14166 and others. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3057) Set incremental=false for sqlline-thin.py like sqlline.py
[ https://issues.apache.org/jira/browse/PHOENIX-3057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366900#comment-15366900 ] Enis Soztutar commented on PHOENIX-3057: +1. > Set incremental=false for sqlline-thin.py like sqlline.py > -- > > Key: PHOENIX-3057 > URL: https://issues.apache.org/jira/browse/PHOENIX-3057 > Project: Phoenix > Issue Type: Bug >Reporter: Josh Elser >Assignee: Josh Elser >Priority: Trivial > Fix For: 4.9.0 > > Attachments: PHOENIX-3057.patch > > > Same as PHOENIX-2580 for sqlline-thin. Keeps things consistent whether users > are using the thick or thin driver. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3055) Fix a few resource leaks and null dereferences reported by Coverity
[ https://issues.apache.org/jira/browse/PHOENIX-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366556#comment-15366556 ] Enis Soztutar commented on PHOENIX-3055: bq. Though seemingly innocuous, changes like this can lead to subtle issues. Also, adding null checks everywhere is not always necessary. I'd recommend holding off on any changes like this, this late in the game for 4.8. Would be great if we could come up with a list of the handful of JIRAs we're targeting for 4.8 and stick to only working on those. Agreed, that was my concern as well. The changes for closing the statements seem to be very low risk. I was more concerned about closing the Connection objects from MetaDataEndpointImpl. Let's run the tests, but hold off on the commit until after 4.8. > Fix a few resource leaks and null dereferences reported by Coverity > --- > > Key: PHOENIX-3055 > URL: https://issues.apache.org/jira/browse/PHOENIX-3055 > Project: Phoenix > Issue Type: Bug >Reporter: Alicia Ying Shu >Assignee: Alicia Ying Shu >Priority: Minor > Attachments: PHOENIX-3055.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3055) Fix a few resource leaks and null dereferences reported by Coverity
[ https://issues.apache.org/jira/browse/PHOENIX-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365453#comment-15365453 ] Enis Soztutar commented on PHOENIX-3055: This looks fine to me. This change should not be needed:
{code}
diff --git a/phoenix-core/src/main/java/org/apache/phoenix/schema/MetaDataClient.java b/phoenix-core/src/main/java/org/apache/phoenix/schema/MetaDataClient.java
index 77dccb1..7e4c883 100644
--- a/phoenix-core/src/main/java/org/apache/phoenix/schema/MetaDataClient.java
+++ b/phoenix-core/src/main/java/org/apache/phoenix/schema/MetaDataClient.java
@@ -3427,6 +3427,7 @@ public class MetaDataClient {
     try (PhoenixConnection tenantConn = DriverManager.getConnection(connection.getURL(), props).unwrap(PhoenixConnection.class)) {
         PostDDLCompiler dropCompiler = new PostDDLCompiler(tenantConn);
         tenantConn.getQueryServices().updateData(dropCompiler.compile(entry.getValue(), null, null, Collections.emptyList(), ts));
+        tenantConn.close();
{code}
since {{tenantConn}} is already closed by the {{try}} construct. Did you run the tests with this? > Fix a few resource leaks and null dereferences reported by Coverity > --- > > Key: PHOENIX-3055 > URL: https://issues.apache.org/jira/browse/PHOENIX-3055 > Project: Phoenix > Issue Type: Bug >Reporter: Alicia Ying Shu >Assignee: Alicia Ying Shu >Priority: Minor > Fix For: 4.8.0 > > Attachments: PHOENIX-3055.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
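The try-with-resources point above can be seen in a minimal standalone sketch (a hypothetical Resource class, not Phoenix's PhoenixConnection): a resource declared in the try header has its close() invoked automatically when the block exits, so a manual close() inside the block is redundant.

```java
// Minimal sketch showing that try-with-resources closes the declared
// resource automatically; the Resource class here is hypothetical.
public class TryWithResourcesDemo {
    static class Resource implements AutoCloseable {
        int closeCount = 0;

        @Override
        public void close() {
            closeCount++;
        }
    }

    // Returns how many times close() ran once the try block has exited.
    static int useResource() {
        Resource tracked;
        try (Resource r = new Resource()) {
            tracked = r;
            // work with r here; no explicit r.close() call is needed
        }
        return tracked.closeCount; // closed exactly once, automatically
    }

    public static void main(String[] args) {
        System.out.println("close() ran " + useResource() + " time(s)");
    }
}
```

An explicit close() inside the block would at best be a no-op on an idempotent resource, and at worst close it while later statements in the block still need it.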
[jira] [Commented] (PHOENIX-3045) Data regions in transition forever if RS holding them down during drop index and before flush on data table
[ https://issues.apache.org/jira/browse/PHOENIX-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365093#comment-15365093 ] Enis Soztutar commented on PHOENIX-3045: Indeed, if there is a workaround for this issue, it may be fine to defer, but it seems that this will result in the data table regions being permanently offline. > Data regions in transition forever if RS holding them down during drop index > and before flush on data table > > > Key: PHOENIX-3045 > URL: https://issues.apache.org/jira/browse/PHOENIX-3045 > Project: Phoenix > Issue Type: Bug >Reporter: Sergio Peleato >Assignee: Ankit Singhal > Fix For: 4.9.0 > > > Currently we flush the data table after dropping the index so that region > opening won't fail while recovering dropped index edits. But there is a chance > that the region server holding the data regions might be abruptly killed before > flushing the data table; this leads to the same failure case where the data regions won't > be opened, leaving the regions in transition forever. We need to handle > this case by checking for dropped indexes on recovery write failures and skipping the > corresponding mutations to them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3037) Setup proper security context in compaction/split coprocessor hooks
[ https://issues.apache.org/jira/browse/PHOENIX-3037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357665#comment-15357665 ] Enis Soztutar commented on PHOENIX-3037: Looks good to me. > Setup proper security context in compaction/split coprocessor hooks > --- > > Key: PHOENIX-3037 > URL: https://issues.apache.org/jira/browse/PHOENIX-3037 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.7.0 >Reporter: Lars Hofhansl >Assignee: Andrew Purtell > Fix For: 4.8.0 > > Attachments: PHOENIX-3037-4.x-HBase-0.98.patch, > PHOENIX-3037-4.x-HBase-0.98.patch, PHOENIX-3037-4.x-HBase-1.1.patch, > PHOENIX-3037-4.x-HBase-1.1.patch, PHOENIX-3037-master.patch, > PHOENIX-3037-master.patch > > > See HBASE-16115 for a discussion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2931) Phoenix client asks users to provide configs in cli that are present on the machine in hbase conf
[ https://issues.apache.org/jira/browse/PHOENIX-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15347529#comment-15347529 ] Enis Soztutar commented on PHOENIX-2931: +1. [~elserj] you also want to take a look? > Phoenix client asks users to provide configs in cli that are present on the > machine in hbase conf > - > > Key: PHOENIX-2931 > URL: https://issues.apache.org/jira/browse/PHOENIX-2931 > Project: Phoenix > Issue Type: Bug >Reporter: Alicia Ying Shu >Assignee: Alicia Ying Shu >Priority: Minor > Fix For: 4.9.0 > > Attachments: PHOENIX-2931-v1.patch, PHOENIX-2931-v2.patch, > PHOENIX-2931-v3.patch, PHOENIX-2931-v4.patch, PHOENIX-2931.patch > > > Users had complaints on running commands like > {code} > phoenix-sqlline > pre-prod-poc-2.novalocal,pre-prod-poc-10.novalocal,pre-prod-poc-1.novalocal:/hbase-unsecure > service-logs.sql > {code} > However the zookeeper quorum and the port are available in hbase configs. > Phoenix should read these configs from the system instead of having the user > supply them every time. > What we can do is to introduce a keyword "default". If it is specified, > default zookeeper quorum and port will be taken from hbase configs. > Otherwise, users can specify their own. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2931) Phoenix client asks users to provide configs in cli that are present on the machine in hbase conf
[ https://issues.apache.org/jira/browse/PHOENIX-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345161#comment-15345161 ] Enis Soztutar commented on PHOENIX-2931: Thanks Alicia, this is looking better. bq. This part of code is used by psql.py. If we did not provide connection string in the command line, the first arg would be a file. There is no guarantee the first one is a connection string. We are already checking whether the arg ends with .csv or .sql above. So my suggestion just simplifies the above; you don't have to have a parameter named "j". Can you please try it like this:
{code}
int i = 0; // remove j here
for (String arg : argList) {
    if (execCmd.isUpgrade || arg.endsWith(CSV_FILE_EXT) || arg.endsWith(SQL_FILE_EXT)) {
        inputFiles.add(arg);
    } else {
        if (i == 0) {
            execCmd.connectionString = arg;
        } else {
            usageError("Don't know how to interpret argument '" + arg + "'", options);
        }
    }
    i++;
}
// remove the if (j > 0) check
{code}
I've just noticed two more things.
- in ConnectionInfo.create(), we are constructing the JDBC string by concatenating, only for it to be parsed back immediately and returned as a ConnectionInfo object. We should instead directly return a ConnectionInfo created from the default configuration.
- ConnectionInfo can carry the principal and keytab as well. [~elserj] do you know whether there are corresponding hbase-site.xml properties for these? Can we obtain them in any way for the client side if they do not come from the URL?
> Phoenix client asks users to provide configs in cli that are present on the > machine in hbase conf > - > > Key: PHOENIX-2931 > URL: https://issues.apache.org/jira/browse/PHOENIX-2931 > Project: Phoenix > Issue Type: Bug >Reporter: Alicia Ying Shu >Assignee: Alicia Ying Shu >Priority: Minor > Fix For: 4.9.0 > > Attachments: PHOENIX-2931-v1.patch, PHOENIX-2931-v2.patch, > PHOENIX-2931-v3.patch, PHOENIX-2931.patch > > > Users had complaints on running commands like > {code} > phoenix-sqlline > pre-prod-poc-2.novalocal,pre-prod-poc-10.novalocal,pre-prod-poc-1.novalocal:/hbase-unsecure > service-logs.sql > {code} > However the zookeeper quorum and the port are available in hbase configs. > Phoenix should read these configs from the system instead of having the user > supply them every time. > What we can do is to introduce a keyword "default". If it is specified, > default zookeeper quorum and port will be taken from hbase configs. > Otherwise, users can specify their own. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-2931) Phoenix client asks users to provide configs in cli that are present on the machine in hbase conf
[ https://issues.apache.org/jira/browse/PHOENIX-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated PHOENIX-2931: --- Fix Version/s: (was: 4.8.0) 4.9.0 > Phoenix client asks users to provide configs in cli that are present on the > machine in hbase conf > - > > Key: PHOENIX-2931 > URL: https://issues.apache.org/jira/browse/PHOENIX-2931 > Project: Phoenix > Issue Type: Bug >Reporter: Alicia Ying Shu >Assignee: Alicia Ying Shu >Priority: Minor > Fix For: 4.9.0 > > Attachments: PHOENIX-2931-v1.patch, PHOENIX-2931-v2.patch, > PHOENIX-2931-v3.patch, PHOENIX-2931.patch > > > Users had complaints on running commands like > {code} > phoenix-sqlline > pre-prod-poc-2.novalocal,pre-prod-poc-10.novalocal,pre-prod-poc-1.novalocal:/hbase-unsecure > service-logs.sql > {code} > However the zookeeper quorum and the port are available in hbase configs. > Phoenix should read these configs from the system instead of having the user > supply them every time. > What we can do is to introduce a keyword "default". If it is specified, > default zookeeper quorum and port will be taken from hbase configs. > Otherwise, users can specify their own. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2931) Phoenix client asks users to provide configs in cli that are present on the machine in hbase conf
[ https://issues.apache.org/jira/browse/PHOENIX-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342834#comment-15342834 ] Enis Soztutar commented on PHOENIX-2931: bq. The intention of this print statement was when users did not provide connection string in the command line, we could see it was getting from default explicitly. Of course this info can be found in the connection startup output. You cannot have System.out.println() as a debug statement. Please remove. bq. jdbc:phoenix:null came from psql command line if we did not provide the connection string. jdbc:phoenix;test=true came from PhoenixEmbeddedDriverTest. You cannot have production code containing test-related code like this. We should not pass "null" as the connection string; it should be an empty string. bq. This part of code is used by psql.py. If we did not provide connection string in the command line, the first arg would be a file. There is no guarantee the first one is a connection string. We already check whether it is a file or not above, no? The suggestion simplifies the logic for handling the case where arg does not end with .csv or .sql. > Phoenix client asks users to provide configs in cli that are present on the > machine in hbase conf > - > > Key: PHOENIX-2931 > URL: https://issues.apache.org/jira/browse/PHOENIX-2931 > Project: Phoenix > Issue Type: Bug >Reporter: Alicia Ying Shu >Assignee: Alicia Ying Shu >Priority: Minor > Fix For: 4.8.0 > > Attachments: PHOENIX-2931-v1.patch, PHOENIX-2931-v2.patch, > PHOENIX-2931.patch > > > Users had complaints on running commands like > {code} > phoenix-sqlline > pre-prod-poc-2.novalocal,pre-prod-poc-10.novalocal,pre-prod-poc-1.novalocal:/hbase-unsecure > service-logs.sql > {code} > However the zookeeper quorum and the port are available in hbase configs. > Phoenix should read these configs from the system instead of having the user > supply them every time. > What we can do is to introduce a keyword "default". 
If it is specified, > default zookeeper quorum and port will be taken from hbase configs. > Otherwise, users can specify their own. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2931) Phoenix client asks users to provide configs in cli that are present on the machine in hbase conf
[ https://issues.apache.org/jira/browse/PHOENIX-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342729#comment-15342729 ] Enis Soztutar commented on PHOENIX-2931: Please remove System.out.println() statements. Why {{jdbc:phoenix:null}} and {{jdbc:phoenix;test=true}}? getDefaultConnectionString() does not seem to belong in HBaseFactoryProvider. For below:
{code}
+execCmd.connectionString = arg;
+j = i;
..
+if (j > 0) {
+    usageError("Connection string to HBase must be supplied before input files", options);
}
{code}
You can just do a much simpler thing:
{code}
if (i == 0) {
    execCmd.connectionString = arg;
} else {
    usageError("Don't know how to interpret argument '" + arg + "'", options);
}
{code}
Tests work with the changes? PhoenixEmbeddedDriverTest.testNegativeGetConnectionInfo() needs to be changed? > Phoenix client asks users to provide configs in cli that are present on the > machine in hbase conf > - > > Key: PHOENIX-2931 > URL: https://issues.apache.org/jira/browse/PHOENIX-2931 > Project: Phoenix > Issue Type: Bug >Reporter: Alicia Ying Shu >Assignee: Alicia Ying Shu >Priority: Minor > Fix For: 4.8.0 > > Attachments: PHOENIX-2931-v1.patch, PHOENIX-2931-v2.patch, > PHOENIX-2931.patch > > > Users had complaints on running commands like > {code} > phoenix-sqlline > pre-prod-poc-2.novalocal,pre-prod-poc-10.novalocal,pre-prod-poc-1.novalocal:/hbase-unsecure > service-logs.sql > {code} > However the zookeeper quorum and the port are available in hbase configs. > Phoenix should read these configs from the system instead of having the user > supply them every time. > What we can do is to introduce a keyword "default". If it is specified, > default zookeeper quorum and port will be taken from hbase configs. > Otherwise, users can specify their own. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (PHOENIX-2931) Phoenix client asks users to provide configs in cli that are present on the machine in hbase conf
[ https://issues.apache.org/jira/browse/PHOENIX-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342729#comment-15342729 ] Enis Soztutar edited comment on PHOENIX-2931 at 6/21/16 9:18 PM: - Please remove System.out.println() statements. Why {{jdbc:phoenix:null}} and {{jdbc:phoenix;test=true}}? getDefaultConnectionString() does not seem to belong in HBaseFactoryProvider. For below:
{code}
+execCmd.connectionString = arg;
+j = i;
..
+if (j > 0) {
+    usageError("Connection string to HBase must be supplied before input files", options);
}
{code}
You can just do a much simpler thing:
{code}
if (i == 0) {
    execCmd.connectionString = arg;
} else {
    usageError("Don't know how to interpret argument '" + arg + "'", options);
}
{code}
Tests work with the changes? PhoenixEmbeddedDriverTest.testNegativeGetConnectionInfo() needs to be changed? was (Author: enis): Please remove System.out.println() statements. Why {{jdbc:phoenix:null}} and {{jdbc:phoenix;test=true}}? getDefaultConnectionString() does not seem to belong in HBaseFactoryProvider. For below:
{code}
+execCmd.connectionString = arg;
+j = i;
..
+if (j > 0) {
+    usageError("Connection string to HBase must be supplied before input files", options);
}
{code}
You can just do a much simpler thing:
{code}
if (i == 0) {
    execCmd.connectionString = arg;
} else {
    usageError("Don't know how to interpret argument '" + arg + "'", options);
}
{code}
Tests work with the changes? PhoenixEmbeddedDriverTest.testNegativeGetConnectionInfo() needs to be changed? 
> Phoenix client asks users to provide configs in cli that are present on the > machine in hbase conf > - > > Key: PHOENIX-2931 > URL: https://issues.apache.org/jira/browse/PHOENIX-2931 > Project: Phoenix > Issue Type: Bug >Reporter: Alicia Ying Shu >Assignee: Alicia Ying Shu >Priority: Minor > Fix For: 4.8.0 > > Attachments: PHOENIX-2931-v1.patch, PHOENIX-2931-v2.patch, > PHOENIX-2931.patch > > > Users had complaints on running commands like > {code} > phoenix-sqlline > pre-prod-poc-2.novalocal,pre-prod-poc-10.novalocal,pre-prod-poc-1.novalocal:/hbase-unsecure > service-logs.sql > {code} > However the zookeeper quorum and the port are available in hbase configs. > Phoenix should read these configs from the system instead of having the user > supply them every time. > What we can do is to introduce a keyword "default". If it is specified, > default zookeeper quorum and port will be taken from hbase configs. > Otherwise, users can specify their own. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3009) Estimated Size for PTable is ~20% greater than actual size
[ https://issues.apache.org/jira/browse/PHOENIX-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340919#comment-15340919 ] Enis Soztutar commented on PHOENIX-3009: Did you see HBASE-15950? CompressedOops makes a huge difference for estimated pointer sizes. > Estimated Size for PTable is ~20% greater than actual size > -- > > Key: PHOENIX-3009 > URL: https://issues.apache.org/jira/browse/PHOENIX-3009 > Project: Phoenix > Issue Type: Improvement >Reporter: Mujtaba Chohan > Attachments: PHOENIX-3009.patch, ptable_retained.png > > > {{PTable.estimatedSize}} returns a size that is around 20% higher than the actual > PTable size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
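A rough illustration of why CompressedOops matters for such estimates (the header size and field counts below are hypothetical, not PTable's real layout): with compressed oops a reference costs 4 bytes instead of 8, so an estimator that assumes 8-byte references overcounts on a compressed-oops heap.

```java
// Back-of-the-envelope sketch of reference-size impact on heap estimates.
// Assumptions (hypothetical): a 16-byte object header, some reference
// fields, some 8-byte primitive fields. With CompressedOops (the 64-bit
// JVM default for heaps under ~32 GB) a reference is 4 bytes; without it, 8.
public class RefSizeEstimate {

    static long estimate(int refFields, int longFields, int refSizeBytes) {
        long header = 16; // assumed object header size
        return header + (long) refFields * refSizeBytes + (long) longFields * 8L;
    }

    public static void main(String[] args) {
        long compressed   = estimate(20, 4, 4); // CompressedOops on
        long uncompressed = estimate(20, 4, 8); // CompressedOops off
        System.out.println("compressed=" + compressed
                + " uncompressed=" + uncompressed);
    }
}
```

The gap between the two results is exactly the kind of systematic overestimate the issue describes when the estimator's assumed pointer size does not match the running JVM's.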
[jira] [Commented] (PHOENIX-2535) Create shaded clients (thin + thick)
[ https://issues.apache.org/jira/browse/PHOENIX-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15332785#comment-15332785 ] Enis Soztutar commented on PHOENIX-2535: Great work! bq. Enis Soztutar Interesting question. Do we want to publish fat shaded jars for client or I can just skip the install phase for client and server. I don't remember for sure, but we were discussing that it would be good to publish the artifact for the full client. I think it will help. Some libraries publish fat jars to Maven, which is very convenient for users. I think we should definitely do that. Let's open a follow-up issue. > Create shaded clients (thin + thick) > - > > Key: PHOENIX-2535 > URL: https://issues.apache.org/jira/browse/PHOENIX-2535 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Sergey Soldatov > Fix For: 4.8.0 > > Attachments: PHOENIX-2535-1.patch, PHOENIX-2535-2.patch, > PHOENIX-2535-3.patch, PHOENIX-2535-4.patch, PHOENIX-2535-5.patch, > PHOENIX-2535-6.patch, PHOENIX-2535-7.patch > > > Having shaded client artifacts helps greatly in minimizing the dependency > conflicts at the run time. We are seeing more of Phoenix JDBC client being > used in Storm topologies and other settings where guava versions become a > problem. > I think we can do a parallel artifact for the thick client with shaded > dependencies and also using shaded hbase. For thin client, maybe shading > should be the default since it is new? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2892) Scan for pre-warming the block cache for 2ndary index should be removed
[ https://issues.apache.org/jira/browse/PHOENIX-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323578#comment-15323578 ] Enis Soztutar commented on PHOENIX-2892: Thanks [~jamestaylor] for looking. > Scan for pre-warming the block cache for 2ndary index should be removed > --- > > Key: PHOENIX-2892 > URL: https://issues.apache.org/jira/browse/PHOENIX-2892 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.8.0 > > Attachments: phoenix-2892_v1.patch > > > We have run into an issue in a mid-sized cluster with secondary indexes. The > problem is that all handlers for doing writes were blocked waiting on a > single scan from the secondary index to complete for > 5mins, thus causing > all incoming RPCs to timeout and causing write un-availability and further > problems (disabling the index, etc). We've taken jstack outputs continuously > from the servers to understand what is going on. > In the jstack outputs from that particular server, we can see three types of > stacks (this is raw jstack so the thread names are not there unfortunately). 
> - First, there are a lot of threads waiting for the MVCC transactions > started previously: > {code} > Thread 15292: (state = BLOCKED) > - java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be > imprecise) > - > org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl.waitForPreviousTransactionsComplete(org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl$WriteEntry) > @bci=86, line=253 (Compiled frame) > - > org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl.completeMemstoreInsertWithSeqNum(org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl$WriteEntry, > org.apache.hadoop.hbase.regionserver.SequenceId) @bci=29, line=135 (Compiled > frame) > - > org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(org.apache.hadoop.hbase.regionserver.HRegion$BatchOperationInProgress) > @bci=1906, line=3187 (Compiled frame) > - > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(org.apache.hadoop.hbase.regionserver.HRegion$BatchOperationInProgress) > @bci=79, line=2819 (Compiled frame) > - > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(org.apache.hadoop.hbase.client.Mutation[], > long, long) @bci=12, line=2761 (Compiled frame) > - > org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$RegionActionResult$Builder, > org.apache.hadoop.hbase.regionserver.Region, > org.apache.hadoop.hbase.quotas.OperationQuota, java.util.List, > org.apache.hadoop.hbase.CellScanner) @bci=150, line=692 (Compiled frame) > - > org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(org.apache.hadoop.hbase.regionserver.Region, > org.apache.hadoop.hbase.quotas.OperationQuota, > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$RegionAction, > org.apache.hadoop.hbase.CellScanner, > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$RegionActionResult$Builder, > java.util.List, long) @bci=547, line=654 
(Compiled frame) > - > org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(com.google.protobuf.RpcController, > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MultiRequest) > @bci=407, line=2032 (Compiled frame) > - > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(com.google.protobuf.Descriptors$MethodDescriptor, > com.google.protobuf.RpcController, com.google.protobuf.Message) @bci=167, > line=32213 (Compiled frame) > - > org.apache.hadoop.hbase.ipc.RpcServer.call(com.google.protobuf.BlockingService, > com.google.protobuf.Descriptors$MethodDescriptor, > com.google.protobuf.Message, org.apache.hadoop.hbase.CellScanner, long, > org.apache.hadoop.hbase.monitoring.MonitoredRPCHandler) @bci=59, line=2114 > (Compiled frame) > - org.apache.hadoop.hbase.ipc.CallRunner.run() @bci=345, line=101 (Compiled > frame) > - > org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(java.util.concurrent.BlockingQueue) > @bci=54, line=130 (Compiled frame) > - org.apache.hadoop.hbase.ipc.RpcExecutor$1.run() @bci=20, line=107 > (Interpreted frame) > - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame) > {code} > The way MVCC works is that it assumes that transactions are short living, and > it guarantees that transactions are committed in strict serial order. > Transactions in this case are write requests coming in and being executed > from handlers. Each handler will start a transaction, get a mvcc write index > (which is the mvcc trx number) and does the WAL append + memstore append. > Then it marks the mvcc trx to be
[jira] [Commented] (PHOENIX-2892) Scan for pre-warming the block cache for 2ndary index should be removed
[ https://issues.apache.org/jira/browse/PHOENIX-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322160#comment-15322160 ] Enis Soztutar commented on PHOENIX-2892: Did the test again with 3M inserts with YCSB, this time with Connection.autoCommit(): https://github.com/enis/YCSB/commit/6bb5fb8e84ab637cd8768270f1fc895debc60b6d. Each region flushes a couple of times to make sure that the block cache is used. I've tested pure inserts, not updates. Results are even better with batch size of 1K (75% better throughput):
Without patch:
{code}
(batch=1)    [OVERALL], RunTime(ms), 833913.0
             [OVERALL], Throughput(ops/sec), 3597.497580682877
(batch=10)   [OVERALL], RunTime(ms), 724676.0
             [OVERALL], Throughput(ops/sec), 4139.781088376047
(batch=100)  [OVERALL], RunTime(ms), 334914.0
             [OVERALL], Throughput(ops/sec), 8957.523423923754
(batch=1000) [OVERALL], RunTime(ms), 215280.0
             [OVERALL], Throughput(ops/sec), 13935.340022296545
{code}
With patch:
{code}
(batch=1)    [OVERALL], RunTime(ms), 742610.0
             [OVERALL], Throughput(ops/sec), 4039.8055506928267
(batch=10)   [OVERALL], RunTime(ms), 638678.0
             [OVERALL], Throughput(ops/sec), 4697.202659242998
(batch=100)  [OVERALL], RunTime(ms), 230182.0
             [OVERALL], Throughput(ops/sec), 13033.165060691106
(batch=1000) [OVERALL], RunTime(ms), 122742.0
             [OVERALL], Throughput(ops/sec), 24441.511463068877
{code}
> Scan for pre-warming the block cache for 2ndary index should be removed > --- > > Key: PHOENIX-2892 > URL: https://issues.apache.org/jira/browse/PHOENIX-2892 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.8.0 > > Attachments: phoenix-2892_v1.patch > > > We have run into an issue in a mid-sized cluster with secondary indexes. 
The > problem is that all handlers for doing writes were blocked waiting on a > single scan from the secondary index to complete for > 5mins, thus causing > all incoming RPCs to timeout and causing write un-availability and further > problems (disabling the index, etc). We've taken jstack outputs continuously > from the servers to understand what is going on. > In the jstack outputs from that particular server, we can see three types of > stacks (this is raw jstack so the thread names are not there unfortunately). > - First, there are a lot of threads waiting for the MVCC transactions > started previously: > {code} > Thread 15292: (state = BLOCKED) > - java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be > imprecise) > - > org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl.waitForPreviousTransactionsComplete(org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl$WriteEntry) > @bci=86, line=253 (Compiled frame) > - > org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl.completeMemstoreInsertWithSeqNum(org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl$WriteEntry, > org.apache.hadoop.hbase.regionserver.SequenceId) @bci=29, line=135 (Compiled > frame) > - > org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(org.apache.hadoop.hbase.regionserver.HRegion$BatchOperationInProgress) > @bci=1906, line=3187 (Compiled frame) > - > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(org.apache.hadoop.hbase.regionserver.HRegion$BatchOperationInProgress) > @bci=79, line=2819 (Compiled frame) > - > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(org.apache.hadoop.hbase.client.Mutation[], > long, long) @bci=12, line=2761 (Compiled frame) > - > org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$RegionActionResult$Builder, > org.apache.hadoop.hbase.regionserver.Region, > 
org.apache.hadoop.hbase.quotas.OperationQuota, java.util.List, > org.apache.hadoop.hbase.CellScanner) @bci=150, line=692 (Compiled frame) > - > org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(org.apache.hadoop.hbase.regionserver.Region, > org.apache.hadoop.hbase.quotas.OperationQuota, > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$RegionAction, > org.apache.hadoop.hbase.CellScanner, > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$RegionActionResult$Builder, > java.util.List, long) @bci=547, line=654 (Compiled frame) > - > org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(com.google.protobuf.RpcController, > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MultiRequest) > @bci=407, line=2032 (Compiled frame) > - > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(com.google.protobuf.Descriptors$MethodDescriptor, > com.google.protobuf.RpcController, com.google.protobuf.Message) @bci=167, > line=32213
[jira] [Commented] (PHOENIX-2892) Scan for pre-warming the block cache for 2ndary index should be removed
[ https://issues.apache.org/jira/browse/PHOENIX-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15321647#comment-15321647 ] Enis Soztutar commented on PHOENIX-2892: Sure, let me test it today, otherwise I'll revert. I can hack YCSB to do the autoCommit(false) way. > Scan for pre-warming the block cache for 2ndary index should be removed > --- > > Key: PHOENIX-2892 > URL: https://issues.apache.org/jira/browse/PHOENIX-2892 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.8.0 > > Attachments: phoenix-2892_v1.patch > > > We have run into an issue in a mid-sized cluster with secondary indexes. The > problem is that all handlers for doing writes were blocked waiting on a > single scan from the secondary index to complete for > 5mins, thus causing > all incoming RPCs to timeout and causing write unavailability and further > problems (disabling the index, etc). We've taken jstack outputs continuously > from the servers to understand what is going on. > In the jstack outputs from that particular server, we can see three types of > stacks (this is raw jstack so the thread names are not there unfortunately). 
> - First, there are a lot of threads waiting for the MVCC transactions > started previously: > {code} > Thread 15292: (state = BLOCKED) > - java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be > imprecise) > - > org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl.waitForPreviousTransactionsComplete(org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl$WriteEntry) > @bci=86, line=253 (Compiled frame) > - > org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl.completeMemstoreInsertWithSeqNum(org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl$WriteEntry, > org.apache.hadoop.hbase.regionserver.SequenceId) @bci=29, line=135 (Compiled > frame) > - > org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(org.apache.hadoop.hbase.regionserver.HRegion$BatchOperationInProgress) > @bci=1906, line=3187 (Compiled frame) > - > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(org.apache.hadoop.hbase.regionserver.HRegion$BatchOperationInProgress) > @bci=79, line=2819 (Compiled frame) > - > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(org.apache.hadoop.hbase.client.Mutation[], > long, long) @bci=12, line=2761 (Compiled frame) > - > org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$RegionActionResult$Builder, > org.apache.hadoop.hbase.regionserver.Region, > org.apache.hadoop.hbase.quotas.OperationQuota, java.util.List, > org.apache.hadoop.hbase.CellScanner) @bci=150, line=692 (Compiled frame) > - > org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(org.apache.hadoop.hbase.regionserver.Region, > org.apache.hadoop.hbase.quotas.OperationQuota, > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$RegionAction, > org.apache.hadoop.hbase.CellScanner, > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$RegionActionResult$Builder, > java.util.List, long) @bci=547, line=654 
(Compiled frame) > - > org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(com.google.protobuf.RpcController, > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MultiRequest) > @bci=407, line=2032 (Compiled frame) > - > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(com.google.protobuf.Descriptors$MethodDescriptor, > com.google.protobuf.RpcController, com.google.protobuf.Message) @bci=167, > line=32213 (Compiled frame) > - > org.apache.hadoop.hbase.ipc.RpcServer.call(com.google.protobuf.BlockingService, > com.google.protobuf.Descriptors$MethodDescriptor, > com.google.protobuf.Message, org.apache.hadoop.hbase.CellScanner, long, > org.apache.hadoop.hbase.monitoring.MonitoredRPCHandler) @bci=59, line=2114 > (Compiled frame) > - org.apache.hadoop.hbase.ipc.CallRunner.run() @bci=345, line=101 (Compiled > frame) > - > org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(java.util.concurrent.BlockingQueue) > @bci=54, line=130 (Compiled frame) > - org.apache.hadoop.hbase.ipc.RpcExecutor$1.run() @bci=20, line=107 > (Interpreted frame) > - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame) > {code} > The way MVCC works is that it assumes that transactions are short living, and > it guarantees that transactions are committed in strict serial order. > Transactions in this case are write requests coming in and being executed > from handlers. Each handler will start a transaction, get a mvcc write index > (which is the mvcc trx number) and does the
[jira] [Commented] (PHOENIX-2535) Create shaded clients (thin + thick)
[ https://issues.apache.org/jira/browse/PHOENIX-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15321607#comment-15321607 ] Enis Soztutar commented on PHOENIX-2535: I did a {{mvn clean install -DskipTests}} , which failed with the patch: {code} [INFO] Replacing /Users/enis/projects/git-repos/phoenix/phoenix-client/target/phoenix-4.8.0-HBase-1.2-SNAPSHOT-client.jar with /Users/enis/projects/git-repos/phoenix/phoenix-client/target/phoenix-client-4.8.0-HBase-1.2-SNAPSHOT-shaded.jar [INFO] [INFO] --- maven-install-plugin:2.5.1:install (default-install) @ phoenix-client --- [INFO] [INFO] Reactor Summary: [INFO] [INFO] Apache Phoenix SUCCESS [2.044s] [INFO] Phoenix Core .. SUCCESS [37.499s] [INFO] Phoenix - Flume ... SUCCESS [1.686s] [INFO] Phoenix - Pig . SUCCESS [1.081s] [INFO] Phoenix Query Server Client ... SUCCESS [5.403s] [INFO] Phoenix Query Server .. SUCCESS [3.574s] [INFO] Phoenix - Pherf ... SUCCESS [3.717s] [INFO] Phoenix - Spark ... SUCCESS [17.787s] [INFO] Phoenix - Hive SUCCESS [13.561s] [INFO] Phoenix Client FAILURE [49.333s] [INFO] Phoenix Server SKIPPED [INFO] Phoenix Assembly .. SKIPPED [INFO] Phoenix - Tracing Web Application . 
SKIPPED [INFO] [INFO] BUILD FAILURE [INFO] [INFO] Total time: 2:15.989s [INFO] Finished at: Wed Jun 08 15:46:32 PDT 2016 [INFO] Final Memory: 119M/663M [INFO] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) on project phoenix-client: The packaging for this project did not assign a file to the build artifact -> [Help 1] {code} > Create shaded clients (thin + thick) > - > > Key: PHOENIX-2535 > URL: https://issues.apache.org/jira/browse/PHOENIX-2535 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Sergey Soldatov > Fix For: 4.8.0 > > Attachments: PHOENIX-2535-1.patch, PHOENIX-2535-2.patch, > PHOENIX-2535-3.patch, PHOENIX-2535-4.patch, PHOENIX-2535-5.patch, > PHOENIX-2535-6.patch, PHOENIX-2535-7.patch > > > Having shaded client artifacts helps greatly in minimizing the dependency > conflicts at the run time. We are seeing more of Phoenix JDBC client being > used in Storm topologies and other settings where guava versions become a > problem. > I think we can do a parallel artifact for the thick client with shaded > dependencies and also using shaded hbase. For thin client, maybe shading > should be the default since it is new? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2892) Scan for pre-warming the block cache for 2ndary index should be removed
[ https://issues.apache.org/jira/browse/PHOENIX-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15321493#comment-15321493 ] Enis Soztutar commented on PHOENIX-2892: bq. In Phoenix, the batching you're doing will still execute a commit after every row is upserted (you could argue that it shouldn't and it wouldn't be hard to change, but that's the way it works today). Hmm, I made the changes after looking at CALCITE-1128. I think we should fix that in Phoenix then. Isn't that the "correct" way to do batching in JDBC? setAutoCommit(false) should be more for transactions, I believe. The YCSB client can already do setAutoCommit(), but it does not call commit() at all (not even on connection close).
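The pattern being debated above (with autoCommit off, Phoenix buffers upserted rows on the client and ships them to the server on commit(), so batching amounts to committing every N rows) can be sketched as follows. This is a minimal illustration, not Phoenix or YCSB code; the `CommitBatcher` class and its method names are hypothetical, and the JDBC calls it would wrap appear only in the comments.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch of commit-every-N-rows write batching. With autoCommit disabled,
 * each UPSERT is buffered client-side; a call to commit() is what actually
 * flushes the buffered mutations to the server. This helper only tracks
 * when that commit should happen.
 */
public class CommitBatcher {
    private final int batchSize;
    private int pending = 0;

    public CommitBatcher(int batchSize) {
        this.batchSize = batchSize;
    }

    /** Call once per upserted row; returns true when conn.commit() is due. */
    public boolean rowAdded() {
        pending++;
        if (pending >= batchSize) {
            pending = 0;
            return true; // caller would issue conn.commit() here
        }
        return false;
    }

    /** For illustration: the row indices at which commits would fire. */
    public static List<Integer> commitPoints(int rows, int batchSize) {
        CommitBatcher batcher = new CommitBatcher(batchSize);
        List<Integer> points = new ArrayList<>();
        for (int i = 1; i <= rows; i++) {
            if (batcher.rowAdded()) {
                points.add(i);
            }
        }
        return points;
    }
}
```

Under JDBC-standard batching, the equivalent cadence would call PreparedStatement.addBatch() per row and executeBatch() at each point where this helper signals a commit; a trailing commit is still needed for any partial batch left at the end.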
[jira] [Commented] (PHOENIX-2892) Scan for pre-warming the block cache for 2ndary index should be removed
[ https://issues.apache.org/jira/browse/PHOENIX-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15321260#comment-15321260 ] Enis Soztutar commented on PHOENIX-2892: Yep, you need the other commit on top of this so that we use UPSERT rather than INSERT in the YCSB JDBC client. BTW, [~busbey] do you mind reviewing the PR https://github.com/brianfrankcooper/YCSB/pull/755?
[jira] [Commented] (PHOENIX-2892) Scan for pre-warming the block cache for 2ndary index should be removed
[ https://issues.apache.org/jira/browse/PHOENIX-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15319901#comment-15319901 ] Enis Soztutar commented on PHOENIX-2892: I've added batching in https://github.com/enis/YCSB, and the improvement still stands with batching. However, I don't think I have the numbers handy. Let's commit this one for 4.8.
[jira] [Commented] (PHOENIX-2535) Create shaded clients (thin + thick)
[ https://issues.apache.org/jira/browse/PHOENIX-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15310908#comment-15310908 ] Enis Soztutar commented on PHOENIX-2535: [~sergey.soldatov] did you see my comment here: https://issues.apache.org/jira/browse/PHOENIX-2535?focusedCommentId=15253006=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15253006? > Create shaded clients (thin + thick) > - > > Key: PHOENIX-2535 > URL: https://issues.apache.org/jira/browse/PHOENIX-2535 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Sergey Soldatov > Fix For: 4.8.0 > > Attachments: PHOENIX-2535-1.patch, PHOENIX-2535-2.patch, > PHOENIX-2535-3.patch, PHOENIX-2535-4.patch, PHOENIX-2535-5.patch, > PHOENIX-2535-6.patch > > > Having shaded client artifacts helps greatly in minimizing the dependency > conflicts at the run time. We are seeing more of Phoenix JDBC client being > used in Storm topologies and other settings where guava versions become a > problem. > I think we can do a parallel artifact for the thick client with shaded > dependencies and also using shaded hbase. For thin client, maybe shading > should be the default since it is new? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2931) Phoenix client asks users to provide configs in cli that are present on the machine in hbase conf
[ https://issues.apache.org/jira/browse/PHOENIX-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15309004#comment-15309004 ] Enis Soztutar commented on PHOENIX-2931: Thinking about this, there is no surefire way to check the first argument to see whether it is the connection string. If it contains ":", we can assume it is a connection string; otherwise, we can check whether it is a file by checking whether the file exists. Or we can accept a smaller compatibility issue and only assume a connection string if it contains ":"; otherwise we just assume it is a file. This will break in cases where only the hostname is provided without the port or zk root node, but that may be fine for Phoenix-5.0. This is a very important improvement in user-friendliness; I think we should have it one way or the other. > Phoenix client asks users to provide configs in cli that are present on the > machine in hbase conf > - > > Key: PHOENIX-2931 > URL: https://issues.apache.org/jira/browse/PHOENIX-2931 > Project: Phoenix > Issue Type: Bug >Reporter: Alicia Ying Shu >Assignee: Alicia Ying Shu >Priority: Minor > Attachments: PHOENIX-2931.patch > > > Users had complaints on running commands like > {code} > phoenix-sqlline > pre-prod-poc-2.novalocal,pre-prod-poc-10.novalocal,pre-prod-poc-1.novalocal:/hbase-unsecure > service-logs.sql > {code} > However the zookeeper quorum and the port are available in hbase configs. > Phoenix should read these configs from the system instead of having the user > supply them every time. > What we can do is to introduce a keyword "default". If it is specified, > default zookeeper quorum and port will be taken from hbase configs. > Otherwise, users can specify their own. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
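The disambiguation heuristic proposed in the comment above (treat the first argument as a connection string if it contains ":", otherwise as a SQL file if such a file exists) can be sketched like this. `ArgClassifier` is a hypothetical illustration written for this discussion, not actual sqlline or Phoenix code.

```java
import java.nio.file.Files;
import java.nio.file.Paths;

/**
 * Sketch of the first-argument heuristic: a ":" suggests a connection
 * string (host:port:/zk-root); a bare token that names an existing file
 * is taken as a SQL script; anything else is ambiguous.
 */
public class ArgClassifier {
    public enum Kind { CONNECTION_STRING, SQL_FILE, UNKNOWN }

    public static Kind classify(String arg) {
        if (arg.contains(":")) {
            // e.g. "zk1,zk2,zk3:2181:/hbase-unsecure"
            return Kind.CONNECTION_STRING;
        }
        if (Files.exists(Paths.get(arg))) {
            // e.g. a script such as "service-logs.sql" in the cwd
            return Kind.SQL_FILE;
        }
        // A bare hostname with no port or zk root also lands here, which
        // is exactly the compatibility gap the comment calls out.
        return Kind.UNKNOWN;
    }
}
```

A bare hostname like "zkhost" is indistinguishable from a missing file under this rule, which is why the comment frames it as a compatibility trade-off acceptable for a major release.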
[jira] [Commented] (PHOENIX-2892) Scan for pre-warming the block cache for 2ndary index should be removed
[ https://issues.apache.org/jira/browse/PHOENIX-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294428#comment-15294428 ] Enis Soztutar commented on PHOENIX-2892: Thanks [~jesse_yates], [~jamestaylor]. I was curious about whether this would be a regression in perf or not. I was able to get a YCSB run with the YCSB jdbc client with small changes on a single-node Phoenix-enabled regionserver. With a single client running 100 threads, I ran the test to write 5M rows of 1KB size, with an index, the same as the primary key, covering 1 column out of 10. I artificially limited the block cache to ~1.3GB to see whether bringing the blocks into the block cache has any effect. Table is created with: {code} DROP TABLE IF EXISTS USERTABLE; CREATE TABLE USERTABLE ( YCSB_KEY VARCHAR(255) PRIMARY KEY, FIELD0 VARCHAR, FIELD1 VARCHAR, FIELD2 VARCHAR, FIELD3 VARCHAR, FIELD4 VARCHAR, FIELD5 VARCHAR, FIELD6 VARCHAR, FIELD7 VARCHAR, FIELD8 VARCHAR, FIELD9 VARCHAR ) SALT_BUCKETS=20, COMPRESSION='NONE' ; CREATE INDEX USERTABLE_IDX ON USERTABLE (YCSB_KEY) INCLUDE (FIELD0) SALT_BUCKETS=20; {code} Here are the rough numbers (first is load, second is run step): {code} WITHOUT PATCH: [OVERALL], RunTime(ms), 1617487.0 [OVERALL], Throughput(ops/sec), 3091.2149525776713 [INSERT], Operations, 500.0 [INSERT], AverageLatency(us), 32250.9799648 [INSERT], MinLatency(us), 4356.0 [INSERT], MaxLatency(us), 2988031.0 [INSERT], 95thPercentileLatency(us), 64127.0 [INSERT], 99thPercentileLatency(us), 103423.0 [INSERT], Return=OK, 500 [OVERALL], RunTime(ms), 476698.0 [OVERALL], Throughput(ops/sec), 2097.764202912536 [INSERT], Operations, 100.0 [INSERT], AverageLatency(us), 47000.933524 [INSERT], MinLatency(us), 3892.0 [INSERT], MaxLatency(us), 927231.0 [INSERT], 95thPercentileLatency(us), 91711.0 [INSERT], 99thPercentileLatency(us), 193151.0 [INSERT], Return=OK, 100 {code} and {code} [OVERALL], RunTime(ms), 1318665.0 [OVERALL], Throughput(ops/sec), 3791.713589122332 
[INSERT], Operations, 500.0 [INSERT], AverageLatency(us), 26281.5258828 [INSERT], MinLatency(us), 3744.0 [INSERT], MaxLatency(us), 3966975.0 [INSERT], 95thPercentileLatency(us), 35199.0 [INSERT], 99thPercentileLatency(us), 75647.0 [INSERT], Return=OK, 500 [OVERALL], RunTime(ms), 267711.0 [OVERALL], Throughput(ops/sec), 3735.3713519429534 [INSERT], Operations, 100.0 [INSERT], AverageLatency(us), 26218.333712 [INSERT], MinLatency(us), 3526.0 [INSERT], MaxLatency(us), 749567.0 [INSERT], 95thPercentileLatency(us), 34687.0 [INSERT], 99thPercentileLatency(us), 75455.0 [INSERT], Return=OK, 100 {code} Overall, the load is actually 20% or so faster with the patch. The YCSB JDBC client does not do any batching right now, and autocommit=false is not working since it is not calling commit() at all. If time permits, I'll try to get a run with some batching as well.
[jira] [Commented] (PHOENIX-542) Batch size calculations should take into account secondary indexes
[ https://issues.apache.org/jira/browse/PHOENIX-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15288025#comment-15288025 ] Enis Soztutar commented on PHOENIX-542: --- In HTable / BufferedMutatorImpl we already do byte-based batching. It makes sense to do byte-based batching in Phoenix as well. > Batch size calculations should take into account secondary indexes > -- > > Key: PHOENIX-542 > URL: https://issues.apache.org/jira/browse/PHOENIX-542 > Project: Phoenix > Issue Type: Task >Affects Versions: 3.0-Release >Reporter: mujtaba > Labels: enhancement > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
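The byte-based batching referenced above (as opposed to counting rows) can be sketched as follows. This is a simplified illustration of the idea, not HBase's BufferedMutatorImpl: the size accounting here counts only payload bytes, whereas the real client uses full heap-size estimates, and `ByteBatchBuffer` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch of byte-based write batching: mutations accumulate until their
 * total size crosses a threshold, then the whole batch flushes. Counting
 * bytes rather than rows keeps batches bounded even when row sizes vary,
 * and (per PHOENIX-542) the accounting should include index-table
 * mutations, not just data-table ones.
 */
public class ByteBatchBuffer {
    private final long flushThresholdBytes;
    private final List<byte[]> pending = new ArrayList<>();
    private long pendingBytes = 0;
    private int flushes = 0;

    public ByteBatchBuffer(long flushThresholdBytes) {
        this.flushThresholdBytes = flushThresholdBytes;
    }

    /** Buffers one mutation; flushes once the byte threshold is reached. */
    public void add(byte[] mutation) {
        pending.add(mutation);
        pendingBytes += mutation.length;
        if (pendingBytes >= flushThresholdBytes) {
            flush();
        }
    }

    private void flush() {
        // In a real client this is where the buffered mutations would be
        // sent to the servers; here we only reset the accounting.
        pending.clear();
        pendingBytes = 0;
        flushes++;
    }

    public int flushCount() { return flushes; }
    public long pendingBytes() { return pendingBytes; }
}
```

With ten 30-byte mutations and a 100-byte threshold, this buffer flushes twice (after the 4th and 8th add) and carries 60 bytes pending, which is the behavior a row-count batcher cannot guarantee when sizes vary.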
[jira] [Commented] (PHOENIX-2892) Scan for pre-warming the block cache for 2ndary index should be removed
[ https://issues.apache.org/jira/browse/PHOENIX-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15285859#comment-15285859 ] Enis Soztutar commented on PHOENIX-2892: bq. The initial scan makes the overall operation faster, not slower. If it doesn't anymore, we should remove it. How did you guys initially test it? I can try to replicate the setup with and without the patch if it is documented somewhere. I can understand why a skip scan can be faster than a batch of gets. But in this case we are doing a skip scan before a batch of gets, not instead of one. So I think it should be different from the test done in the thread http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/34697.
[jira] [Updated] (PHOENIX-2905) hadoop-2.5.1 artifacts are in the dependency tree
[ https://issues.apache.org/jira/browse/PHOENIX-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated PHOENIX-2905: --- Attachment: phoenix-2905_v1.patch This should do the trick. [~sergey.soldatov] FYI. > hadoop-2.5.1 artifacts are in the dependency tree > -- > > Key: PHOENIX-2905 > URL: https://issues.apache.org/jira/browse/PHOENIX-2905 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.8.0 > > Attachments: dep, phoenix-2905_v1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-2905) hadoop-2.5.1 artifacts are in the dependency tree
[ https://issues.apache.org/jira/browse/PHOENIX-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated PHOENIX-2905: --- Attachment: dep This is causing problems in eclipse since all UT runs fail with NoClassDefFoundException due to non-binary compatible 2.5.1 and 2.7.1 hadoop jars. Attaching output from {{mvn dependency:tree}} > hadoop-2.5.1 artifacts are in the dependency tree > -- > > Key: PHOENIX-2905 > URL: https://issues.apache.org/jira/browse/PHOENIX-2905 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.8.0 > > Attachments: dep > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (PHOENIX-2905) hadoop-2.5.1 artifacts are in the dependency tree
Enis Soztutar created PHOENIX-2905: -- Summary: hadoop-2.5.1 artifacts are in the dependency tree Key: PHOENIX-2905 URL: https://issues.apache.org/jira/browse/PHOENIX-2905 Project: Phoenix Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 4.8.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (PHOENIX-2892) Scan for pre-warming the block cache for 2ndary index should be removed
[ https://issues.apache.org/jira/browse/PHOENIX-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15285753#comment-15285753 ] Enis Soztutar edited comment on PHOENIX-2892 at 5/17/16 12:51 AM: -- bq. One thing to try would be to lower this and confirm that the batch size client is using for multiple UPSERT VALUES is small as well. Makes sense. I think in this cluster, we'll run with a lower batch size than default. Due to the way the hregion internals work, having a very long running mvcc transaction will just cause further delays though. Maybe we should go with 100 or so as the default limit size for updates with index. bq. Either way, you're going to have the locks due to the processing of the batch mutation for the same amount of time, no? I imagine the time will reduce since at least we would not be doing the initial scan. The other scans we have to do in any case. BTW, does the correctness of the non-transactional 2ndary index updates depend on row lock being exclusive? In HBase-2.0, the row locks are RW, and doMiniBatchMutation() acquires read-only locks. was (Author: enis): bq. One thing to try would be to lower this and confirm that the batch size client is using for multiple UPSERT VALUES is small as well. Makes sense. I think in this cluster, we'll run with a lower batch size than default. Due to the way the hregion internals work, having a very long running mvcc transaction will just cause further delays though. Maybe we should go with 100 or so as the default limit size for updates with index. bq. Either way, you're going to have the locks due to the processing of the batch mutation for the same amount of time, no? I imagine the time will reduce since at least we would not be doing the initial scan. The other scans we have to do in any case. BTW, does the correctness of the 2ndary index updates depend on row lock being exclusive? In HBase-2.0, the row locks are RW, and doMiniBatchMutation() acquires read-only locks.
[jira] [Commented] (PHOENIX-2892) Scan for pre-warming the block cache for 2ndary index should be removed
[ https://issues.apache.org/jira/browse/PHOENIX-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15285753#comment-15285753 ] Enis Soztutar commented on PHOENIX-2892: bq. One thing to try would be to lower this and confirm that the batch size client is using for multiple UPSERT VALUES is small as well. Makes sense. I think in this cluster, we'll run with a lower batch size than default. Due to the way the hregion internals work, having a very long running mvcc transaction will just cause further delays though. Maybe we should go with 100 or so as the default limit size for updates with index. bq. Either way, you're going to have the locks due to the processing of the batch mutation for the same amount of time, no? I imagine the time will reduce since at least we would not be doing the initial scan. The other scans we have to do in any case. BTW, does the correctness of the 2ndary index updates depend on row lock being exclusive? In HBase-2.0, the row locks are RW, and doMiniBatchMutation() acquires read-only locks. > Scan for pre-warming the block cache for 2ndary index should be removed > --- > > Key: PHOENIX-2892 > URL: https://issues.apache.org/jira/browse/PHOENIX-2892 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.8.0 > > Attachments: phoenix-2892_v1.patch > > > We have run into an issue in a mid-sized cluster with secondary indexes. The > problem is that all handlers for doing writes were blocked waiting on a > single scan from the secondary index to complete for > 5mins, thus causing > all incoming RPCs to timeout and causing write un-availability and further > problems (disabling the index, etc). We've taken jstack outputs continuously > from the servers to understand what is going on. > In the jstack outputs from that particular server, we can see three types of > stacks (this is raw jstack so the thread names are not there unfortunately).
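The row-lock question raised in the comment above (whether index correctness depends on the row lock being exclusive, given that HBase-2.0 uses read/write row locks and doMiniBatchMutation() takes only the read side) can be illustrated with a plain java.util.concurrent read/write lock. This is a simplified stand-in for HBase's internal row locking, not its actual code.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch of HBase-2.0-style read/write row locks (names hypothetical).
// Plain batch mutations take the shared (read) side, so two batches touching
// the same row can proceed concurrently; an operation needing exclusivity
// (e.g. a check-and-mutate) takes the write side and excludes everyone else.
class RowLockSketch {
    private final ReentrantReadWriteLock rowLock = new ReentrantReadWriteLock();

    boolean tryAcquireSharedRowLock() {
        return rowLock.readLock().tryLock();
    }

    void releaseSharedRowLock() {
        rowLock.readLock().unlock();
    }

    boolean tryAcquireExclusiveRowLock() {
        return rowLock.writeLock().tryLock();
    }
}
```

The implication for the question above: if index maintenance assumed the row lock was exclusive, two concurrent batch mutations on the same row would both be admitted under the shared lock, so that assumption would no longer hold.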
[jira] [Commented] (PHOENIX-2897) Some ITs are not run
[ https://issues.apache.org/jira/browse/PHOENIX-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283019#comment-15283019 ] Enis Soztutar commented on PHOENIX-2897: This is the diff between what is run versus the classes with IT and Test in their name (excluding Base and Abstract classes): {code} < ClientManagedTimeTest 67d46 < ConnectionQueryServicesTestImpl 74d52 < CoveredIndexCodecForTesting 93d70 < DropIndexDuringUpsertIT 100d76 < End2EndTestDriver 121,122d96 128d101 < HiveTestUtil 135d107 < IndexHandlerIT 140,141d111 < IndexTestUtil < IndexTestingUtils 172d141 < MinVersionTestRunner 184d152 < NeedsOwnMiniClusterTest 225d192 < PhoenixTestDriver 253d219 < QueryServicesTestImpl 263d228 < ReadWriteKeyValuesWithCodecIT 272d236 < ReserveNSequenceTestIT 275,277d238 286a248 > RowKeyOrderedAggregateResultIteratorTest 336d297 373d333 < TestPhoenixIndexRpcSchedulerFactory 377d336 < TestUtil 385d343 < TracingTestUtil {code} A couple of things: - DropIndexDuringUpsertIT is an abstract test with no implementation that extends it. It has a single test that is excluded. - IndexHandlerIT seems to be testing the RPC priority. It fails with NPE. - ReadWriteKeyValuesWithCodecIT and ReserveNSequenceTestIT - Two classes doing the same thing: IndexTestUtil and IndexTestingUtils. - We should use {{-DfailIfNoTests=false}}, otherwise we cannot run a subset of tests from modules. - In HBase, we have a unit test to cover this exact case: when a new test comes in without a Category annotation, we fail the unit test. See TestCheckTestClasses. > Some ITs are not run > - > > Key: PHOENIX-2897 > URL: https://issues.apache.org/jira/browse/PHOENIX-2897 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar >Priority: Critical > Fix For: 4.8.0 > > > I've noticed that some of the IT tests are not run from the mvn verify > command. 
These are tests that are not marked with an explicit {{@Category}} > or does not extend the base test classes. > Some example ones are: > {code} > IndexHandlerIT > ReadWriteKeyValuesWithCodecIT > {code} > See the lack of these tests in > https://builds.apache.org/view/All/job/phoenix-master/1223/consoleFull -- This message was sent by Atlassian JIRA (v6.3.4#6332)
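The guard test mentioned in the comment above (HBase's TestCheckTestClasses, which fails the build when a new test class arrives without a category annotation) can be sketched like this. The annotation and the sample classes here are stand-ins; a real implementation would scan the classpath for *IT / *Test classes and use JUnit's org.junit.experimental.categories.Category.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.ArrayList;
import java.util.List;

// Sketch of a "no uncategorized tests" guard: given candidate test classes,
// report any concrete class missing the category annotation. Names are
// hypothetical stand-ins, not HBase's actual TestCheckTestClasses code.
class CategoryGuard {
    @Retention(RetentionPolicy.RUNTIME)
    @interface Category { }

    @Category static class ProperlyTaggedIT { }
    static class UntaggedIT { }

    // Returns the simple names of classes missing the annotation; a guard
    // test would assert this list is empty and fail the build otherwise.
    static List<String> findUncategorized(Class<?>... candidates) {
        List<String> missing = new ArrayList<>();
        for (Class<?> c : candidates) {
            if (c.getAnnotation(Category.class) == null) {
                missing.add(c.getSimpleName());
            }
        }
        return missing;
    }
}
```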
[jira] [Created] (PHOENIX-2897) Some ITs are not run
Enis Soztutar created PHOENIX-2897: -- Summary: Some ITs are not run Key: PHOENIX-2897 URL: https://issues.apache.org/jira/browse/PHOENIX-2897 Project: Phoenix Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Priority: Critical I've noticed that some of the IT tests are not run from the mvn verify command. These are tests that are not marked with an explicit {{@Category}} or does not extend the base test classes. Some example ones are: {code} IndexHandlerIT ReadWriteKeyValuesWithCodecIT {code} See the lack of these tests in https://builds.apache.org/view/All/job/phoenix-master/1223/consoleFull -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-2897) Some ITs are not run
[ https://issues.apache.org/jira/browse/PHOENIX-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated PHOENIX-2897: --- Fix Version/s: 4.8.0 > Some ITs are not run > - > > Key: PHOENIX-2897 > URL: https://issues.apache.org/jira/browse/PHOENIX-2897 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar >Priority: Critical > Fix For: 4.8.0 > > > I've noticed that some of the IT tests are not run from the mvn verify > command. These are tests that are not marked with an explicit {{@Category}} > or does not extend the base test classes. > Some example ones are: > {code} > IndexHandlerIT > ReadWriteKeyValuesWithCodecIT > {code} > See the lack of these tests in > https://builds.apache.org/view/All/job/phoenix-master/1223/consoleFull -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2892) Scan for pre-warming the block cache for 2ndary index should be removed
[ https://issues.apache.org/jira/browse/PHOENIX-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281077#comment-15281077 ] Enis Soztutar commented on PHOENIX-2892: Thanks James. bq. Have you tried lower phoenix.mutate.batchSize or decreasing the number of rows being upserted before a commit is done by the client? I believe the cluster was running with {{phoenix.mutate.batchSize=1000}} when this happened initially. With the hbase's asyncprocess grouping the edits by regionserver, there can be even more or less in the incoming {{multi()}} call. bq. Functionally, removing the pre-warming won't have an impact, so if it prevents this situation, then feel free to remove it. However, if the skip scan is taking long, the cumulative time of doing the individual scans on each row will take even longer, so you may just be kicking the can down the road. We are doing the gets in parallel (I believe 10 threads by default), while the skip scan is still a serial scanner, no? Plus, we are doing double work. Even if the results are coming from the block cache, the scan overhead is not negligible. I am surprised that the secondary index performance was improved. I'm running the tests with {{mvn verify -Dtest=*Index*}}, but some tests are failing out of the box without the patch for me. Maybe I have a setup issue. [~chrajeshbab...@gmail.com] any idea? > Scan for pre-warming the block cache for 2ndary index should be removed > --- > > Key: PHOENIX-2892 > URL: https://issues.apache.org/jira/browse/PHOENIX-2892 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.8.0 > > Attachments: phoenix-2892_v1.patch > > > We have run into an issue in a mid-sized cluster with secondary indexes. 
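The "gets in parallel" argument in the comment above (a batch of point gets fanned out over a thread pool, versus one serial skip scan) can be sketched with a standard executor. The 10-thread figure is taken from the comment itself; the get task here is a stand-in, not a Phoenix/HBase API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch contrasting a batch of point gets issued in parallel (as the index
// maintenance path does) with a single serial scan. Types are stand-ins.
class ParallelGets {
    static List<String> executeGets(List<String> rowKeys, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String key : rowKeys) {
                // Each task stands in for one Get RPC against the region.
                tasks.add(() -> "result-for-" + key);
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : pool.invokeAll(tasks)) {
                results.add(f.get()); // invokeAll preserves submission order
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

A serial skip scan, by contrast, visits the same rows one after another on a single scanner thread, which is the "double work" concern raised above when the scan is done in addition to the gets.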
[jira] [Updated] (PHOENIX-2892) Scan for pre-warming the block cache for 2ndary index should be removed
[ https://issues.apache.org/jira/browse/PHOENIX-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated PHOENIX-2892: --- Attachment: phoenix-2892_v1.patch [~jamestaylor] do you have a background on why we are doing a scan for pre-warming the block cache? Is it worth doing double work? Attaching simple patch. Running the index tests with it. > Scan for pre-warming the block cache for 2ndary index should be removed > --- > > Key: PHOENIX-2892 > URL: https://issues.apache.org/jira/browse/PHOENIX-2892 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.8.0 > > Attachments: phoenix-2892_v1.patch > > > We have run into an issue in a mid-sized cluster with secondary indexes. The > problem is that all handlers for doing writes were blocked waiting on a > single scan from the secondary index to complete for > 5mins, thus causing > all incoming RPCs to timeout and causing write un-availability and further > problems (disabling the index, etc). We've taken jstack outputs continuously > from the servers to understand what is going on. > In the jstack outputs from that particular server, we can see three types of > stacks (this is raw jstack so the thread names are not there unfortunately). 
[jira] [Created] (PHOENIX-2892) Scan for pre-warming the block cache for 2ndary index should be removed
Enis Soztutar created PHOENIX-2892: -- Summary: Scan for pre-warming the block cache for 2ndary index should be removed Key: PHOENIX-2892 URL: https://issues.apache.org/jira/browse/PHOENIX-2892 Project: Phoenix Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 4.8.0 We have run into an issue in a mid-sized cluster with secondary indexes. The problem is that all handlers for doing writes were blocked waiting on a single scan from the secondary index to complete for > 5mins, thus causing all incoming RPCs to timeout and causing write un-availability and further problems (disabling the index, etc). We've taken jstack outputs continuously from the servers to understand what is going on. In the jstack outputs from that particular server, we can see three types of stacks (this is raw jstack so the thread names are not there unfortunately). - First, there are a lot of threads waiting for the MVCC transactions started previously: {code} Thread 15292: (state = BLOCKED) - java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be imprecise) - org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl.waitForPreviousTransactionsComplete(org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl$WriteEntry) @bci=86, line=253 (Compiled frame) - org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl.completeMemstoreInsertWithSeqNum(org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl$WriteEntry, org.apache.hadoop.hbase.regionserver.SequenceId) @bci=29, line=135 (Compiled frame) - org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(org.apache.hadoop.hbase.regionserver.HRegion$BatchOperationInProgress) @bci=1906, line=3187 (Compiled frame) - org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(org.apache.hadoop.hbase.regionserver.HRegion$BatchOperationInProgress) @bci=79, line=2819 (Compiled frame) - 
org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(org.apache.hadoop.hbase.client.Mutation[], long, long) @bci=12, line=2761 (Compiled frame) - org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$RegionActionResult$Builder, org.apache.hadoop.hbase.regionserver.Region, org.apache.hadoop.hbase.quotas.OperationQuota, java.util.List, org.apache.hadoop.hbase.CellScanner) @bci=150, line=692 (Compiled frame) - org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(org.apache.hadoop.hbase.regionserver.Region, org.apache.hadoop.hbase.quotas.OperationQuota, org.apache.hadoop.hbase.protobuf.generated.ClientProtos$RegionAction, org.apache.hadoop.hbase.CellScanner, org.apache.hadoop.hbase.protobuf.generated.ClientProtos$RegionActionResult$Builder, java.util.List, long) @bci=547, line=654 (Compiled frame) - org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(com.google.protobuf.RpcController, org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MultiRequest) @bci=407, line=2032 (Compiled frame) - org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(com.google.protobuf.Descriptors$MethodDescriptor, com.google.protobuf.RpcController, com.google.protobuf.Message) @bci=167, line=32213 (Compiled frame) - org.apache.hadoop.hbase.ipc.RpcServer.call(com.google.protobuf.BlockingService, com.google.protobuf.Descriptors$MethodDescriptor, com.google.protobuf.Message, org.apache.hadoop.hbase.CellScanner, long, org.apache.hadoop.hbase.monitoring.MonitoredRPCHandler) @bci=59, line=2114 (Compiled frame) - org.apache.hadoop.hbase.ipc.CallRunner.run() @bci=345, line=101 (Compiled frame) - org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(java.util.concurrent.BlockingQueue) @bci=54, line=130 (Compiled frame) - org.apache.hadoop.hbase.ipc.RpcExecutor$1.run() @bci=20, line=107 (Interpreted frame) - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame) 
{code}
The way MVCC works is that it assumes transactions are short-lived, and it guarantees that transactions are committed in strict serial order. Transactions in this case are the write requests coming in and being executed by handlers. Each handler starts a transaction, gets an MVCC write index (which is the MVCC trx number), and does the WAL append + memstore append. Then it marks the MVCC trx complete, and before returning to the user we have to guarantee that the transaction is visible, so we wait for the MVCC read point to be advanced beyond our own write trx number. This is done at the above stack trace (waitForPreviousTransactionsComplete). A lot of threads with this stack means that one or more handlers have started an MVCC transaction but did not finish the work, and thus did not complete their transactions. MVCC read point can
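The serial-visibility mechanism described above can be modeled in a few lines of plain Java. This is a toy sketch of the idea, not HBase's actual MultiVersionConsistencyControl implementation; the class and method names here are illustrative:

```java
import java.util.TreeSet;

// Toy model of HBase's MVCC as described above: each handler gets a strictly
// increasing write number, and the read point (what readers may see) only
// advances past a write once ALL earlier writes have completed too.
public class MvccSketch {
    private long nextWriteNumber = 0;
    private long readPoint = 0;
    private final TreeSet<Long> pending = new TreeSet<>();

    public synchronized long beginWrite() {           // start a transaction
        long wn = ++nextWriteNumber;
        pending.add(wn);
        return wn;
    }

    public synchronized void completeWrite(long wn) { // WAL + memstore done
        pending.remove(wn);
        // Strict serial order: the read point stops just short of the oldest
        // still-pending write, so one stuck handler blocks everyone behind it.
        readPoint = pending.isEmpty() ? nextWriteNumber : pending.first() - 1;
        notifyAll();
    }

    // The waitForPreviousTransactionsComplete step from the stack above:
    // block until our own write (hence every earlier one) is visible.
    public synchronized void waitForVisible(long wn) throws InterruptedException {
        while (readPoint < wn) {
            wait();
        }
    }

    public synchronized long readPoint() {
        return readPoint;
    }
}
```

Under this model, if the handler holding write #1 stalls, every handler that finished a later write still blocks in waitForVisible(), which matches the pile-up of threads in the jstack output.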
[jira] [Commented] (PHOENIX-2535) Create shaded clients (thin + thick)
[ https://issues.apache.org/jira/browse/PHOENIX-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253006#comment-15253006 ] Enis Soztutar commented on PHOENIX-2535: It seems that we are ending up with two jars now: one is coming from the main phoenix-client pom, and the other from phoenix-assembly. Under {{phoenix-client/target/}}:
{code}
89M -rw-r--r-- 1 enis staff 89M Apr 21 16:30 phoenix-4.8.0-HBase-1.1-SNAPSHOT-client.jar
87M -rw-r--r-- 1 enis staff 87M Apr 21 16:28 phoenix-client-4.8.0-HBase-1.1-SNAPSHOT.jar
{code}
I am not sure whether we actually still need the assembly to contain {{phoenix-assembly/src/build/components/all-common-jars.xml}} etc. It seems that if we are building the jars from the modules, the assembly's job can simply be to gather these into the tarball. > Create shaded clients (thin + thick) > - > > Key: PHOENIX-2535 > URL: https://issues.apache.org/jira/browse/PHOENIX-2535 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Sergey Soldatov > Fix For: 4.8.0 > > Attachments: PHOENIX-2535-1.patch, PHOENIX-2535-2.patch, > PHOENIX-2535-3.patch, PHOENIX-2535-4.patch, PHOENIX-2535-5.patch > > > Having shaded client artifacts helps greatly in minimizing the dependency > conflicts at the run time. We are seeing more of Phoenix JDBC client being > used in Storm topologies and other settings where guava versions become a > problem. > I think we can do a parallel artifact for the thick client with shaded > dependencies and also using shaded hbase. For thin client, maybe shading > should be the default since it is new? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2535) Create shaded clients (thin + thick)
[ https://issues.apache.org/jira/browse/PHOENIX-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15252989#comment-15252989 ] Enis Soztutar commented on PHOENIX-2535: Thanks [~sergey.soldatov]. Let me take a look. [~elserj] you want to do that as well. > Create shaded clients (thin + thick) > - > > Key: PHOENIX-2535 > URL: https://issues.apache.org/jira/browse/PHOENIX-2535 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Sergey Soldatov > Fix For: 4.8.0 > > Attachments: PHOENIX-2535-1.patch, PHOENIX-2535-2.patch, > PHOENIX-2535-3.patch, PHOENIX-2535-4.patch, PHOENIX-2535-5.patch > > > Having shaded client artifacts helps greatly in minimizing the dependency > conflicts at the run time. We are seeing more of Phoenix JDBC client being > used in Storm topologies and other settings where guava versions become a > problem. > I think we can do a parallel artifact for the thick client with shaded > dependencies and also using shaded hbase. For thin client, maybe shading > should be the default since it is new? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (PHOENIX-2843) Phoenix-site does not build with JDK-8
[ https://issues.apache.org/jira/browse/PHOENIX-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar resolved PHOENIX-2843. Resolution: Fixed > Phoenix-site does not build with JDK-8 > -- > > Key: PHOENIX-2843 > URL: https://issues.apache.org/jira/browse/PHOENIX-2843 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.8.0 > > Attachments: PHOENIX-2843_v1.patch > > > Trying to build the phoenix-site with JDK-8 fails with the following: > {code} > HW10676:phoenix-docs$ ./build.sh docs > src/tools/org/h2/build/BuildBase.java:136: error: no suitable method found > for replaceAll(String,String,String) > pattern = replaceAll(pattern, "/", File.separator); > ^ > method List.replaceAll(UnaryOperator) is not applicable > (actual and formal argument lists differ in length) > method ArrayList.replaceAll(UnaryOperator) is not applicable > (actual and formal argument lists differ in length) > 1 error > Error: Could not find or load main class org.h2.build.Build > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2843) Phoenix-site does not build with JDK-8
[ https://issues.apache.org/jira/browse/PHOENIX-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250291#comment-15250291 ] Enis Soztutar commented on PHOENIX-2843: Thanks Nick. bq. What's org.h2? Should this be pushed upstream? Seems that our docs structure has been copy-pasted from H2 database: https://searchcode.com/codesearch/view/19935917/. > Phoenix-site does not build with JDK-8 > -- > > Key: PHOENIX-2843 > URL: https://issues.apache.org/jira/browse/PHOENIX-2843 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.8.0 > > Attachments: PHOENIX-2843_v1.patch > > > Trying to build the phoenix-site with JDK-8 fails with the following: > {code} > HW10676:phoenix-docs$ ./build.sh docs > src/tools/org/h2/build/BuildBase.java:136: error: no suitable method found > for replaceAll(String,String,String) > pattern = replaceAll(pattern, "/", File.separator); > ^ > method List.replaceAll(UnaryOperator) is not applicable > (actual and formal argument lists differ in length) > method ArrayList.replaceAll(UnaryOperator) is not applicable > (actual and formal argument lists differ in length) > 1 error > Error: Could not find or load main class org.h2.build.Build > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
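The resolution failure above is the classic Java pitfall that methods are shadowed by simple name, not by signature: once a class inherits any method named replaceAll (here, Java 8's new List.replaceAll(UnaryOperator)), an unqualified call inside that class no longer sees a same-named helper in an enclosing scope. A plausible minimal reproduction, with the fix of qualifying the call; the class and method bodies here are illustrative stand-ins, not the actual h2 BuildBase code:

```java
import java.util.ArrayList;

public class ShadowDemo {
    // Simplified stand-in for h2's BuildBase.replaceAll(String, String, String).
    static String replaceAll(String s, String before, String after) {
        return s.replace(before, after);
    }

    // A class that inherits from a List, as the h2 build tooling does.
    static class FileList extends ArrayList<String> {
        String normalize(String pattern) {
            // On JDK 8 the unqualified call replaceAll(pattern, "/", "|") no
            // longer compiles here: the inherited List.replaceAll(UnaryOperator)
            // shadows the enclosing class's 3-arg helper by name, so only the
            // 1-arg overload is considered, yielding the "no suitable method
            // found" error above. Qualifying the call restores the intent:
            return ShadowDemo.replaceAll(pattern, "/", "|");
        }
    }

    public static void main(String[] args) {
        System.out.println(new FileList().normalize("src/tools/org")); // src|tools|org
    }
}
```

Compiled with JDK 7, the unqualified form resolves to the enclosing helper (ArrayList has no replaceAll there); with JDK 8 it fails exactly as in the build output quoted above.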
[jira] [Updated] (PHOENIX-2843) Phoenix-site does not build with JDK-8
[ https://issues.apache.org/jira/browse/PHOENIX-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated PHOENIX-2843: --- Attachment: PHOENIX-2843_v1.patch One liner patch. This is against the svn repository for phoenix-site. Tested that it works as intended. > Phoenix-site does not build with JDK-8 > -- > > Key: PHOENIX-2843 > URL: https://issues.apache.org/jira/browse/PHOENIX-2843 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.8.0 > > Attachments: PHOENIX-2843_v1.patch > > > Trying to build the phoenix-site with JDK-8 fails with the following: > {code} > HW10676:phoenix-docs$ ./build.sh docs > src/tools/org/h2/build/BuildBase.java:136: error: no suitable method found > for replaceAll(String,String,String) > pattern = replaceAll(pattern, "/", File.separator); > ^ > method List.replaceAll(UnaryOperator) is not applicable > (actual and formal argument lists differ in length) > method ArrayList.replaceAll(UnaryOperator) is not applicable > (actual and formal argument lists differ in length) > 1 error > Error: Could not find or load main class org.h2.build.Build > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (PHOENIX-2843) Phoenix-site does not build with JDK-8
Enis Soztutar created PHOENIX-2843: -- Summary: Phoenix-site does not build with JDK-8 Key: PHOENIX-2843 URL: https://issues.apache.org/jira/browse/PHOENIX-2843 Project: Phoenix Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 4.8.0 Trying to build the phoenix-site with JDK-8 fails with the following: {code} HW10676:phoenix-docs$ ./build.sh docs src/tools/org/h2/build/BuildBase.java:136: error: no suitable method found for replaceAll(String,String,String) pattern = replaceAll(pattern, "/", File.separator); ^ method List.replaceAll(UnaryOperator) is not applicable (actual and formal argument lists differ in length) method ArrayList.replaceAll(UnaryOperator) is not applicable (actual and formal argument lists differ in length) 1 error Error: Could not find or load main class org.h2.build.Build {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2535) Create shaded clients (thin + thick)
[ https://issues.apache.org/jira/browse/PHOENIX-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15237099#comment-15237099 ] Enis Soztutar commented on PHOENIX-2535: bq. FYI, although we specify Guava 13 in our pom, Tephra works with Guava 12 too, a requirement we had since HBase uses an older version. Guava is one of the worst offenders in terms of breaking binary compatibility, and is generally the reason (among jackson, PB and a few others) that the clients require shading. Not having guava shading will be a half-solution I think. > Create shaded clients (thin + thick) > - > > Key: PHOENIX-2535 > URL: https://issues.apache.org/jira/browse/PHOENIX-2535 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Sergey Soldatov > Fix For: 4.8.0 > > Attachments: PHOENIX-2535-1.patch, PHOENIX-2535-2.patch, > PHOENIX-2535-3.patch, PHOENIX-2535-4.patch, PHOENIX-2535-5.patch > > > Having shaded client artifacts helps greatly in minimizing the dependency > conflicts at the run time. We are seeing more of Phoenix JDBC client being > used in Storm topologies and other settings where guava versions become a > problem. > I think we can do a parallel artifact for the thick client with shaded > dependencies and also using shaded hbase. For thin client, maybe shading > should be the default since it is new? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2535) Create shaded clients (thin + thick)
[ https://issues.apache.org/jira/browse/PHOENIX-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15237095#comment-15237095 ] Enis Soztutar commented on PHOENIX-2535: bq. I know some of these Enis Soztutar already pointed out (mortbay, specifically), but com.google.common, pig/flume/hadoop/hbase, asm, and jersey worry me Not sure about the HBase dependency, but we can assume that if a client wants to depend on Phoenix, they will also want to depend on HBase. Since Phoenix version and HBase version HAS TO go together, I think the HBase non-shaded dependency is fine. Flume, Pig and (upcoming Hive) should not be shaded, otherwise the integration will not work. For example, phoenix-flume implements Sources and Sinks which are flume classes. Good thing is that these are already different modules, so that regular clients do not have to depend on these modules. > Create shaded clients (thin + thick) > - > > Key: PHOENIX-2535 > URL: https://issues.apache.org/jira/browse/PHOENIX-2535 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Sergey Soldatov > Fix For: 4.8.0 > > Attachments: PHOENIX-2535-1.patch, PHOENIX-2535-2.patch, > PHOENIX-2535-3.patch, PHOENIX-2535-4.patch, PHOENIX-2535-5.patch > > > Having shaded client artifacts helps greatly in minimizing the dependency > conflicts at the run time. We are seeing more of Phoenix JDBC client being > used in Storm topologies and other settings where guava versions become a > problem. > I think we can do a parallel artifact for the thick client with shaded > dependencies and also using shaded hbase. For thin client, maybe shading > should be the default since it is new? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2535) Create shaded clients (thin + thick)
[ https://issues.apache.org/jira/browse/PHOENIX-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232934#comment-15232934 ] Enis Soztutar commented on PHOENIX-2535: bq. Possible we need to rename it to phoenix-query-client as well? Name these modules as {{phoenix-queryserver}} and {{phoenix-queryserver-client}}? > Create shaded clients (thin + thick) > - > > Key: PHOENIX-2535 > URL: https://issues.apache.org/jira/browse/PHOENIX-2535 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Sergey Soldatov > Fix For: 4.8.0 > > Attachments: PHOENIX-2535-1.patch, PHOENIX-2535-2.patch, > PHOENIX-2535-3.patch, PHOENIX-2535-4.patch, PHOENIX-2535-5.patch > > > Having shaded client artifacts helps greatly in minimizing the dependency > conflicts at the run time. We are seeing more of Phoenix JDBC client being > used in Storm topologies and other settings where guava versions become a > problem. > I think we can do a parallel artifact for the thick client with shaded > dependencies and also using shaded hbase. For thin client, maybe shading > should be the default since it is new? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2535) Create shaded clients (thin + thick)
[ https://issues.apache.org/jira/browse/PHOENIX-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232931#comment-15232931 ] Enis Soztutar commented on PHOENIX-2535: bq. Once the dust settles here, it would be great to re-evaluate publishing client jars to maven central. It's a real PITA to tell folks doing maven dev to drop a jar into their resources manually. Yes, we are going the maven module route rather than the assembly, so that the jars will end up in the maven repo. The 4.8 RM should especially check whether the publish is going through. We can do a SNAPSHOT publish in the meantime just to check. bq. I hate to be the bearer of bad news, but these are most likely not meeting ASF licensing requirements. For every bundled jar that Phoenix ships in a shaded jar, we're going to have to HBase's shaded modules do the right stuff I think. We can just copy from there. > Create shaded clients (thin + thick) > - > > Key: PHOENIX-2535 > URL: https://issues.apache.org/jira/browse/PHOENIX-2535 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Sergey Soldatov > Fix For: 4.8.0 > > Attachments: PHOENIX-2535-1.patch, PHOENIX-2535-2.patch, > PHOENIX-2535-3.patch, PHOENIX-2535-4.patch, PHOENIX-2535-5.patch > > > Having shaded client artifacts helps greatly in minimizing the dependency > conflicts at the run time. We are seeing more of Phoenix JDBC client being > used in Storm topologies and other settings where guava versions become a > problem. > I think we can do a parallel artifact for the thick client with shaded > dependencies and also using shaded hbase. For thin client, maybe shading > should be the default since it is new? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2535) Create shaded clients (thin + thick)
[ https://issues.apache.org/jira/browse/PHOENIX-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15218898#comment-15218898 ] Enis Soztutar commented on PHOENIX-2535: Thanks Sergey for the updated patch. One question I was thinking of was how clients consume the client jars. The client and server jars are not pushed to the maven repository it seems, only phoenix-core, phoenix-spark, etc. So, if an MR phoenix application or a JDBC application using phoenix depends on phoenix-core, they will not get the shaded jars at all with the patch as it is. Maybe we have to do a different maven module like {{phoenix-shaded-core}} / {{phoenix-shaded-client}} or something like that (similar to hbase-shaded-client). If we are going with the shaded client all the time, then maybe just do {{phoenix-client}} without shaded in the name. Should we also do shading in the {{phoenix-server-client}} module? wdyt [~elserj]? Same thing for the phoenix-spark-client. I think that the way we are doing these jars in the assembly is not correct (not due to this patch of course), since they are not getting pushed to the maven repository, but are just released inside the tarball. We should do a maven module to build the artifact jars (one module for phoenix-client, phoenix-server-client, phoenix-spark-client, etc) so that these jars get released. bq. Only client is shadowing. Server part stays intact. The only difference that for building it shadow plugin is using instead of assemble. Alright, I saw the server using the shading module, but did not check whether there was actual shading. {{org/objectweb}} is still not shaded. We are also now generating empty jars for phoenix-assembly, which are not needed: {{phoenix-assembly-4.8.0-HBase-1.1-SNAPSHOT-XXX.jar}}. I skimmed through some of the code in Phoenix to see whether we may be breaking backwards compatibility accidentally, but it seems that we are not. A second look from another reviewer will also help if possible. 
> Create shaded clients (thin + thick) > - > > Key: PHOENIX-2535 > URL: https://issues.apache.org/jira/browse/PHOENIX-2535 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Sergey Soldatov > Fix For: 4.8.0 > > Attachments: PHOENIX-2535-1.patch, PHOENIX-2535-2.patch, > PHOENIX-2535-3.patch, PHOENIX-2535-4.patch > > > Having shaded client artifacts helps greatly in minimizing the dependency > conflicts at the run time. We are seeing more of Phoenix JDBC client being > used in Storm topologies and other settings where guava versions become a > problem. > I think we can do a parallel artifact for the thick client with shaded > dependencies and also using shaded hbase. For thin client, maybe shading > should be the default since it is new? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2535) Create shaded clients (thin + thick)
[ https://issues.apache.org/jira/browse/PHOENIX-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217350#comment-15217350 ] Enis Soztutar commented on PHOENIX-2535: HBase's shaded poms also contain these: {code} false false true ${project.build.directory}/dependency-reduced-pom.xml false {code} Did you look into those? > Create shaded clients (thin + thick) > - > > Key: PHOENIX-2535 > URL: https://issues.apache.org/jira/browse/PHOENIX-2535 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Sergey Soldatov > Fix For: 4.8.0 > > Attachments: PHOENIX-2535-1.patch, PHOENIX-2535-2.patch, > PHOENIX-2535-3.patch > > > Having shaded client artifacts helps greatly in minimizing the dependency > conflicts at the run time. We are seeing more of Phoenix JDBC client being > used in Storm topologies and other settings where guava versions become a > problem. > I think we can do a parallel artifact for the thick client with shaded > dependencies and also using shaded hbase. For thin client, maybe shading > should be the default since it is new? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2535) Create shaded clients (thin + thick)
[ https://issues.apache.org/jira/browse/PHOENIX-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217347#comment-15217347 ] Enis Soztutar commented on PHOENIX-2535: Thanks [~sergey.soldatov] for pushing this. Some comments:
- I think you reformatted the whole pom file. It makes it very hard to review the difference added in the patch. Can you please undo that? Looking at some pom.xml files, unlike the java code, 2-space indentation seems to be used.
- Instead of
{code}
+ <pattern>com</pattern>
+ <shadedPattern>shaded.phoenix.com</shadedPattern>
{code}
I think we should use {{org.apache.phoenix.shaded.com}}. Similar for others.
- Did a dir listing for the phoenix shaded client jar:
-- {{javax/}}, {{com.sun}} contents seem to be fine.
-- {{org/apache/thrift}}, {{org/objectweb/}}, {{org/apache/directory}}, {{org/eclipse/}}: we are not shading these?
-- org/mortbay: hbase shades this as well. Not log4j though. Maybe needed for slf4j?
-- {{sqlline}} seems to have classes around. Needed for it to work?
-- We have some of these lying around (from some dependencies). Not sure whether they are needed or not:
{code}
META-INF/maven/org.apache.twill/
META-INF/maven/org.apache.twill/twill-common/
META-INF/maven/org.apache.twill/twill-common/pom.xml
META-INF/maven/org.apache.twill/twill-common/pom.properties
{code}
- HBase does not shade the htrace dependency. I think it may be relevant (phoenix trace -> hbase trace -> hdfs trace in the same context).
- Did you test it by running the server jars with hbase? I fear that if HBase has an internal API that exposes something like guava and we are using it in Phoenix, it will be a runtime exception. One safe option is to not shade the server jars, but shade the client jars, if it does not work.
- {{apache/commons/csv/}}: is this part of the API (sorry, did not check)? If we allow extending the CSVBulkLoad MR job, for example.
- flume probably should not be shaded. Otherwise phoenix-flume will not work (phoenix implements flume APIs). 
- same thing for pig (although I see both shaded and un-shaded pig classes) > Create shaded clients (thin + thick) > - > > Key: PHOENIX-2535 > URL: https://issues.apache.org/jira/browse/PHOENIX-2535 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Sergey Soldatov > Fix For: 4.8.0 > > Attachments: PHOENIX-2535-1.patch, PHOENIX-2535-2.patch, > PHOENIX-2535-3.patch > > > Having shaded client artifacts helps greatly in minimizing the dependency > conflicts at the run time. We are seeing more of Phoenix JDBC client being > used in Storm topologies and other settings where guava versions become a > problem. > I think we can do a parallel artifact for the thick client with shaded > dependencies and also using shaded hbase. For thin client, maybe shading > should be the default since it is new? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
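For reference, the relocation convention suggested in the review above (an {{org.apache.phoenix.shaded}} prefix instead of {{shaded.phoenix}}) would look roughly like this in a maven-shade-plugin configuration; the packages chosen here are illustrative, not the committed Phoenix pom:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <relocations>
      <!-- Relocate specific packages under org.apache.phoenix.shaded
           rather than rewriting the bare "com" prefix. -->
      <relocation>
        <pattern>com.google.common</pattern>
        <shadedPattern>org.apache.phoenix.shaded.com.google.common</shadedPattern>
      </relocation>
      <relocation>
        <pattern>org.objectweb</pattern>
        <shadedPattern>org.apache.phoenix.shaded.org.objectweb</shadedPattern>
      </relocation>
    </relocations>
  </configuration>
</plugin>
```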
[jira] [Commented] (PHOENIX-2742) Add new batchmutate APIs in HRegion without mvcc and region level locks
[ https://issues.apache.org/jira/browse/PHOENIX-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184197#comment-15184197 ] Enis Soztutar commented on PHOENIX-2742: bq. I think this is for local index where they will write to a shadow column and that is part of the same table and same region but different Cf? Yes, same region, different column family and also different row for each local index definition. bq. When it is cells to be written to same region (Another CF), we can include new cells into each mutation in the pre hook right? We dont need to do another batchMutate() call. Am I missing any thing? I think so. See my comment from parent jira replicated here: bq. Given the above usage of batchMutate(), I’m assuming that means that local index updates are not transactional with the data updates. I suppose we can do this in a follow up JIRA, but wouldn’t it be possible to use the MultiRowMutationService.mutateRows() call that includes both index and data updates to get transactionality? We could have our Indexer.preBatchMutate() call e.bypass() which is a technique I’ve seen used by Tephra to skip the normal processing. Having the index updates be transactional is one of the most important advantages of local indexing IMHO, so it'd be a shame not to have this. Agreed, that is what I was proposing above: https://issues.apache.org/jira/browse/PHOENIX-1734?focusedCommentId=14986604=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14986604 I was curious to see how would this work with mutateRowsWithLocks(). The problem is that the writes to the data table arrives at the regular {{HRegion.doMiniBatchMutation}} rather than {{mutateRowsWithLocks()}}. However, I was able to write a simple coprocessor test to demonstrate that adding the extra local index mutations coming from the {{IndexCodec}} should be possible in {{preBatchMutate()}}. 
We can do this:
{code}
// add all index updates:
private Put addUnchecked(Cell kv, Put put) {
  byte[] family = CellUtil.cloneFamily(kv);
  NavigableMap<byte[], List<Cell>> familyMap = put.getFamilyCellMap();
  List<Cell> list = familyMap.get(family);
  if (list == null) {
    list = new ArrayList<>();
  }
  list.add(kv);
  familyMap.put(family, list);
  return put;
}
{code}
at {{preBatchMutate()}} for every mutation, and it seems to be working:
{code}
keyvalues={a/info:c1/1447717036828/Put/vlen=1/seqid=0}
keyvalues={a1d/L_info:c1/1447717036828/Put/vlen=0/seqid=0}
keyvalues={a2c/L_info:c1/1447717036828/Put/vlen=0/seqid=0}
keyvalues={a3b/L_info:c1/1447717036828/Put/vlen=0/seqid=0}
keyvalues={a4a/L_info:c1/1447717036828/Put/vlen=0/seqid=0}
keyvalues={b/info:c1/1447717036828/Put/vlen=1/seqid=0}
keyvalues={c/info:c1/1447717036828/Put/vlen=1/seqid=0}
keyvalues={d/info:c1/1447717036828/Put/vlen=1/seqid=0}
{code}
Since this is pre-mutate, the changes should have atomic visibility. The only fishy part is that a Mutation should normally only hold Cells for the same row; here we are adding Cells for the local index that are for different rows. The rowLocks should be fine, since locking only the original row should be enough for index updates. I did not do a deep inspection of doMiniBatchMutation() to see whether there are assumptions that might break. More findings:
- In {{doMiniBatchMutation()}}, we call getFamilyCellMap() before {{preBatchMutate()}}. However, {{prePut()}} and {{preDelete()}} happen before {{doMiniBatchMutation}} is even called:
{code}
Map<byte[], List<Cell>> familyMap = mutation.getFamilyCellMap();
{code}
Thus, the Indexer should inject at the prePut() level, not the {{preBatchMutate()}} level, for local indexes if we are following this path.
- It seems that the other code paths in doMiniBatchMutation() do not need the row key from the Mutation. Only rowLock and {{checkRow()}} need that. 
Local index updates in theory always happen to be in the current region by definition, thus checkRow() is not needed (unless there is a bug). Can we have a case where the row key for the index update is the same as a data row key? If that is possible, lacking row locks might cause problems? bq. Sounds like from your experiment, it wouldn't matter which Mutation in miniBatchOp you added the index Mutations too, right? Though if need be, it wouldn't be difficult to add them to the data Mutation they're associated with I think it will be safest to add the index updates to the same row. The doMiniBatchMutation() is not atomic across operations in the batch, and every operation has a different success / exception tracker. > Add new batchmutate APIs in HRegion without mvcc and region level locks > --- > > Key: PHOENIX-2742 > URL: https://issues.apache.org/jira/browse/PHOENIX-2742 >
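The addUnchecked() experiment from the comment above can be mimicked with plain collections to show the mechanics: extra local index cells are simply folded into a mutation's per-family map, so they travel through the single batch together. Plain strings and maps stand in for HBase's Cell and Put types here; this is an illustration, not the HBase API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model of the addUnchecked() trick: cells for OTHER row keys (the local
// index rows) are merged into the data mutation's family map, so everything
// goes through doMiniBatchMutation in one atomic batch.
public class FamilyMapMerge {
    // family -> cells, shaped like Put.getFamilyCellMap()
    static void addUnchecked(NavigableMap<String, List<String>> familyMap,
                             String family, String cell) {
        familyMap.computeIfAbsent(family, f -> new ArrayList<>()).add(cell);
    }

    public static void main(String[] args) {
        NavigableMap<String, List<String>> put = new TreeMap<>();
        addUnchecked(put, "info", "a/info:c1");       // the data row itself
        addUnchecked(put, "L_info", "a1d/L_info:c1"); // index rows: different row
        addUnchecked(put, "L_info", "a2c/L_info:c1"); // keys, same region
        System.out.println(put.keySet()); // prints [L_info, info]
    }
}
```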
[jira] [Commented] (PHOENIX-2737) Make sure local indexes work properly after fixing region overlaps by HBCK.
[ https://issues.apache.org/jira/browse/PHOENIX-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178452#comment-15178452 ] Enis Soztutar commented on PHOENIX-2737: HBCK has unfortunately been used extensively in cases where there is a problem with region boundaries, as reported here. How often do we run into such situations? I would say in older 0.98 code bases quite often. Recent 1.1+ is much better. Having to rebuild the index after an HBCK run is acceptable as a short-term solution, although not desired. The problem is the severity of the situation: if a problem with a region split happens, it will not only cause some downtime for HBase, it will cause downtime until the index rebuild is complete. In case there is a lot of data to scan and a lot of local indexes (we have seen people trying 50-60 index columns), this will cause a long downtime. The other thing is that no matter how well we document it, users will not remember to do it consistently every time, silently causing data not to be seen by queries. We were discussing 2 alternatives with Rajesh: using separators for startKeys is one, and the other is writing the startKey in every hfile. Since region start keys are derived from actual primary keys, what you are saying, if I interpret it right, is that we cannot rely on the separator byte as is, because it will be in the data itself in case varchar columns etc. are used. In this case, we need a separator for the whole start key (a concat of primary keys), not for individual columns inside. So we have to use a different escape anyway, no? The second solution I was suggesting is to write the start key of the region into every hfile. This would happen in the regular flush / compaction and bulk load paths. If we can do it consistently, then we will automatically know the length of the start key to replace, and we do not need to depend on meta to show us the split points, etc. HBASE-14511 allows doing this, but it is not yet committed. 
However, if we agree that this is the correct solution, we can even just add this field in HBase proper (don't think it will be controversial to add a single field in hfile meta in HBase). > Make sure local indexes work properly after fixing region overlaps by HBCK. > --- > > Key: PHOENIX-2737 > URL: https://issues.apache.org/jira/browse/PHOENIX-2737 > Project: Phoenix > Issue Type: Bug >Reporter: Rajeshbabu Chintaguntla >Assignee: Rajeshbabu Chintaguntla > Fix For: 4.8.0 > > > When there are region overlaps hbck fix by moving hfiles of overlap regions > to new region of common key of overlap regions. Then we might not properly > replace region start key in HFiles in that case. In this case we don't have > any relation of parent child region in hbase:meta so we cannot identify the > start key in HFiles. To fix this we need to add separator after region > start key so that we can easily identify start key in HFile without always > touching hbase:meta. So when we create scanners for the Storefiles we can > check the region start key in hfile with region start key and if any change > we can just replace the old start key with current region start key. During > compaction we can properly replace the start key with actual key values. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
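The second alternative discussed in PHOENIX-2737 above (recording the region start key in every HFile so that local index row keys can be re-prefixed after an HBCK fix, without consulting hbase:meta) hinges on knowing the exact length of the old start key. A sketch of the rewrite step itself, with hypothetical names and a simplified byte-array layout, not Phoenix's actual row-key encoding:

```java
import java.util.Arrays;

// Sketch: swap the region-start-key prefix of a local index row key.
// Knowing oldStartKey.length is the crux of the whole scheme; without a
// separator or a per-HFile record of the start key, the prefix boundary
// is ambiguous whenever the key bytes can themselves contain anything.
public class StartKeyRewrite {
    static byte[] rewritePrefix(byte[] rowKey, byte[] oldStartKey, byte[] newStartKey) {
        byte[] suffix = Arrays.copyOfRange(rowKey, oldStartKey.length, rowKey.length);
        byte[] out = new byte[newStartKey.length + suffix.length];
        System.arraycopy(newStartKey, 0, out, 0, newStartKey.length);
        System.arraycopy(suffix, 0, out, newStartKey.length, suffix.length);
        return out;
    }

    public static void main(String[] args) {
        // Region [aX, ...) merged into region [a, ...): re-prefix the index row.
        byte[] r = rewritePrefix("aXidx".getBytes(), "aX".getBytes(), "a".getBytes());
        System.out.println(new String(r)); // aidx
    }
}
```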
[jira] [Commented] (PHOENIX-1973) Improve CsvBulkLoadTool performance by moving keyvalue construction from map phase to reduce phase
[ https://issues.apache.org/jira/browse/PHOENIX-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174463#comment-15174463 ] Enis Soztutar commented on PHOENIX-1973: Oh I was not following the discussions there, thanks. > Improve CsvBulkLoadTool performance by moving keyvalue construction from map > phase to reduce phase > -- > > Key: PHOENIX-1973 > URL: https://issues.apache.org/jira/browse/PHOENIX-1973 > Project: Phoenix > Issue Type: Improvement >Reporter: Rajeshbabu Chintaguntla >Assignee: Sergey Soldatov > Fix For: 4.7.0 > > Attachments: PHOENIX-1973-1.patch, PHOENIX-1973-2.patch, > PHOENIX-1973-3.patch, PHOENIX-1973-4.patch, PHOENIX-1973-5.patch, > PHOENIX-1973-6.patch, PHOENIX-1973-7.patch > > > It's similar to HBASE-8768. Only thing is we need to write custom mapper and > reducer in Phoenix. In Map phase we just need to get row key from primary key > columns and write the full text of a line as usual(to ensure sorting). In > reducer we need to get actual key values by running upsert query. > It's basically reduces lot of map output to write to disk and data need to be > transferred through network. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-1973) Improve CsvBulkLoadTool performance by moving keyvalue construction from map phase to reduce phase
[ https://issues.apache.org/jira/browse/PHOENIX-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174178#comment-15174178 ] Enis Soztutar commented on PHOENIX-1973: This is already committed, no? Sergey do we need to do an addendum? > Improve CsvBulkLoadTool performance by moving keyvalue construction from map > phase to reduce phase > -- > > Key: PHOENIX-1973 > URL: https://issues.apache.org/jira/browse/PHOENIX-1973 > Project: Phoenix > Issue Type: Improvement >Reporter: Rajeshbabu Chintaguntla >Assignee: Sergey Soldatov > Fix For: 4.7.0 > > Attachments: PHOENIX-1973-1.patch, PHOENIX-1973-2.patch, > PHOENIX-1973-3.patch, PHOENIX-1973-4.patch, PHOENIX-1973-5.patch, > PHOENIX-1973-6.patch, PHOENIX-1973-7.patch > > > It's similar to HBASE-8768. Only thing is we need to write custom mapper and > reducer in Phoenix. In Map phase we just need to get row key from primary key > columns and write the full text of a line as usual(to ensure sorting). In > reducer we need to get actual key values by running upsert query. > It's basically reduces lot of map output to write to disk and data need to be > transferred through network. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
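The split described in the PHOENIX-1973 issue text (parse only the row key in the map phase, defer KeyValue construction to the sorted reduce phase) can be mimicked with plain collections. This is an illustration of the idea only, not the actual CsvBulkLoadTool mapper/reducer:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model: the "mapper" parses only the leading row-key column and emits
// (rowKey, raw line), keeping map output small; the "shuffle" sorts by row
// key; the "reducer" does the full per-row construction on sorted input.
public class BulkLoadSketch {
    // map + shuffle: row key -> original CSV line, sorted by row key
    static TreeMap<String, String> mapAndShuffle(String[] csvLines) {
        TreeMap<String, String> shuffled = new TreeMap<>();
        for (String line : csvLines) {
            shuffled.put(line.split(",", 2)[0], line); // parse key column only
        }
        return shuffled;
    }

    // reduce: rows arrive sorted; build the (stand-in) KeyValues here
    static List<String> reduce(TreeMap<String, String> shuffled) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : shuffled.entrySet()) {
            out.add(e.getKey() + " -> " + e.getValue().substring(e.getKey().length() + 1));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(reduce(mapAndShuffle(new String[]{"k2,betty,2", "k1,al,1"})));
        // prints [k1 -> al,1, k2 -> betty,2]
    }
}
```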