[
https://issues.apache.org/jira/browse/PHOENIX-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350883#comment-14350883
]
Gary Helmling commented on PHOENIX-1674:
----------------------------------------
{quote}
I think those should be implemented first. Going down the path of a custom
delete marker is going to be really ugly. Plus, then we have to support it
going forward (at least until we know that any major compaction ran, I guess).
Do you think it'd be feasible to attempt that? Maybe you and Lars Hofhansl
could collaborate?
{quote}
Tephra already implements a custom delete marker, and this should be completely
transparent to Phoenix. What I was describing was simply a write-time
optimization on top of this: implementing column family delete markers in
Tephra as well. It seems like it might be necessary, depending on the HBase versions
you want to support with this. Even if the development on the HBase changes
were already done, something like adding an "undelete" operation does not seem
like it would go into a release prior to 1.1, so there will be some delay in
the release and adoption cycle.
I would definitely like to avoid duplicating HBase functionality, but
availability of the new features is something we need to consider as well.
{quote}
Functionally I think it's the same to have the SkipScanFilter and
TransactionVisibilityFilter in either order, but I think we'd take a perf hit
under some cases if SkipScanFilter isn't first. Phoenix doesn't use the HBase
Get for point lookups, but uses the SkipScanFilter instead. If we're seeking
over a billion rows and only returning a few rows (which would be a few seeks
from the SkipScanFilter), running the TransactionVisibilityFilter over every
row is going to be more expensive than running it over only the rows that pass
through the SkipScanFilter.
{quote}
Looking at the code for FilterList, I'm not sure this is true. It looks to me
like any seek hint returned by SkipScanFilter would still be respected
regardless of ordering. We can examine this in more detail together.
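To make the ordering question concrete, here is a minimal toy model in Java. It is a sketch under stated assumptions, not HBase, Phoenix, or Tephra code: the names FilterOrderSketch, RowFilter, SkipFilter, VisibilityFilter, scan, and run are all hypothetical, rows are modelled as plain integers, and the list is assumed to stop at the first filter that does not return INCLUDE and to honor that filter's seek hint.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a "must pass all" filter list. Not the HBase or Phoenix API. */
final class FilterOrderSketch {

    enum Code { INCLUDE, SKIP, SEEK_HINT }

    /** A filter inspects a row index and answers with a return code. */
    interface RowFilter {
        Code filter(int row);

        /** Only consulted after this filter returned SEEK_HINT. */
        default int nextRowHint(int row) { return row + 1; }
    }

    /** Stand-in for a skip-scan style filter: accepts target rows, seeks over the rest. */
    static final class SkipFilter implements RowFilter {
        final int[] targets; // must be sorted ascending
        int calls = 0;

        SkipFilter(int... targets) { this.targets = targets; }

        @Override public Code filter(int row) {
            calls++;
            for (int t : targets) if (t == row) return Code.INCLUDE;
            return Code.SEEK_HINT;
        }

        @Override public int nextRowHint(int row) {
            for (int t : targets) if (t > row) return t;
            return Integer.MAX_VALUE; // nothing left to seek to
        }
    }

    /** Stand-in for a visibility check: counts calls and treats every row as visible. */
    static final class VisibilityFilter implements RowFilter {
        int calls = 0;

        @Override public Code filter(int row) { calls++; return Code.INCLUDE; }
    }

    /** Scans rows [0, rowCount) and returns how many rows the scanner landed on. */
    static int scan(int rowCount, List<Integer> out, RowFilter... filters) {
        int visited = 0;
        int row = 0;
        while (row < rowCount) {
            visited++;
            int next = row + 1;
            boolean include = true;
            for (RowFilter f : filters) {
                Code c = f.filter(row);
                if (c == Code.INCLUDE) continue;   // ask the next filter
                include = false;                   // first rejection ends the loop
                if (c == Code.SEEK_HINT) next = f.nextRowHint(row);
                break;
            }
            if (include) out.add(row);
            row = next;
        }
        return visited;
    }

    /** Returns {rows visited, visibility-filter calls, rows returned}. */
    static int[] run(boolean skipFirst) {
        SkipFilter skip = new SkipFilter(10, 500_000, 999_999);
        VisibilityFilter vis = new VisibilityFilter();
        List<Integer> out = new ArrayList<>();
        int visited = skipFirst
                ? scan(1_000_000, out, skip, vis)
                : scan(1_000_000, out, vis, skip);
        return new int[] { visited, vis.calls, out.size() };
    }

    public static void main(String[] args) {
        for (boolean skipFirst : new boolean[] { true, false }) {
            int[] r = run(skipFirst);
            System.out.println("skipFirst=" + skipFirst + " visited=" + r[0]
                    + " visibilityCalls=" + r[1] + " returned=" + r[2]);
        }
    }
}
```

In this model both orderings land on 6 of the 1,000,000 rows and return the same 3 rows, so the seek hints take effect either way. The visibility check runs 3 times when the skip filter is first and 6 times when it is second. The sketch also assumes the visibility check always returns INCLUDE; if it rejected a row itself, the skip filter's hint would not be consulted for that row.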
{quote}
Can these [transaction managers] be co-located with RSs (and does that make
sense)? Would it be possible to communicate to them via an EndPoint coprocessor
and even run in the same JVM as the RS? Not that Tephra may want to always work
this way, but for Phoenix I think it'd make sense, as we're already pinging the
RS (as mentioned above) to ensure our metadata is up to date, so there'd be no
extra overhead. It'd also negate any issues with ensuring a separate, new
service is up and running.
{quote}
Yes, you can run multiple stand-by transaction managers. These could be
co-located with region servers, as long as the region servers are not hogging
all the system resources. You don't want the transaction managers to be
CPU-starved, for example. For that reason, co-locating with master nodes
might be a better option.
> Snapshot isolation transaction support through Tephra
> -----------------------------------------------------
>
> Key: PHOENIX-1674
> URL: https://issues.apache.org/jira/browse/PHOENIX-1674
> Project: Phoenix
> Issue Type: Sub-task
> Reporter: James Taylor
>
> Tephra (http://tephra.io/ and https://github.com/caskdata/tephra) is one
> option for getting transaction support in Phoenix. Let's use this JIRA to
> discuss the way in which this could be integrated along with the pros and
> cons.